Fast and scalable approximation methods for finding minimum cost flows with shared recovery strategies, and system using same

ABSTRACT

Broadly, techniques for solving network routing within a predetermined error are disclosed. These techniques may be applied to networks supporting dedicated reserve capacity, where reserved capacity on links in the network is dedicated for a particular commodity (generally, a source and sink pair of computers), and shared recovery, where reserved capacity on links is shared amongst two or more commodities. These techniques use an iterative process to determine flows on each of the links in a network. Costs are set for each commodity, and primary and secondary (i.e., backup) flows are initialized. A commodity is selected and demand for the commodity is routed through the shortest path. Costs are updated for each potential failure mode. For each commodity, the flows and costs are updated. Once all flows and costs are updated, then it is determined if a function is less than a predetermined value. If the function is less than a predetermined value, then the commodity selection, and flow and cost adjustments are again performed. If the function is greater than the predetermined amount, then the network routing problem is solved to within a predetermined amount from an optimal network routing.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of U.S. Provisional Application No. 60/262,410, filed Jan. 18, 2001.

FIELD OF THE INVENTION

The present invention relates generally to network routing, and more particularly, to fast and scalable approximation methods for finding minimum cost flows with shared recovery strategies and systems using these methods.

BACKGROUND OF THE INVENTION

In a packet switching network, a packet of information is routed from a source to a destination by using a destination address contained in the packet. Packets of information are multiplexed onto network pathways. No particular path is reserved for data in a packet switching network, and a packet can generally take any path in the network. While a packet is not limited to any particular path, techniques exist for routing a packet based on certain criteria, such as speed, cost, distance, and Quality of Service (QoS).

An additional type of network that is becoming common is a label switching network. A label switching network runs in conjunction with a packet switching network, and uses labels placed into packets in order to route packets. Normal routing procedures for packet switching are disabled. A “tunnel” is created to connect a source with a destination. The appropriate labels are selected for packets emanating from the source and entering the network so that the packets follow the tunnel and end up at the destination. Label switching offers the benefit of providing tunnels for packets. Consequently, packets should arrive at the destination at more regular intervals, which is important in certain time-sensitive applications such as real-time audio and video.

Packet switching and label switching networks both benefit from methods that route demands on these networks. These methods can be used in real time, meaning that changes are made to the network in order to more efficiently route demands on the network, or can be used to determine when or how the network should be upgraded to support new demands. A problem with methods used to solve network routing is that they are complex and time-consuming. A need therefore exists for techniques that solve these problems.

SUMMARY OF THE INVENTION

Broadly, techniques for solving network routing within a predetermined error are disclosed. These techniques may be applied to networks supporting dedicated reserve capacity, where reserved capacity on links in the network is dedicated for a particular commodity (generally, a source and sink pair of computers), and shared recovery, where reserved capacity on links is shared amongst two or more commodities. These techniques use an iterative process to determine flows on each of the links in a network. As the flows for each link are modified, the costs for each link are also modified. The process continues to modify flows on links and costs for links until a network routing solution is determined that is within a predetermined amount of an optimal network routing. The network routing generally includes, for each link, a primary amount of flow that will normally flow on the link and a secondary flow that will be placed on the link should a failure occur in certain locations in the network. Once the network routing is determined, a process can configure the network in accordance with this routing, or a network administrator can use the network routing to determine when and how to upgrade the network.

In a first aspect of the invention, a technique is disclosed that performs the following steps. Costs are set for each commodity, and primary and secondary (i.e., backup) flows are initialized. A commodity is selected and demand for the commodity is routed through the shortest path. There are well known techniques for determining the shortest path that minimizes cost according to weights associated with the links. Due to certain assumptions in the present invention on the feasible paths for flows, shortest path computations are relatively easy. One extra complication, however, is that there are different edge costs associated with each failure mode. A failure mode comprises a given set of network elements. Typically, failure modes consisting of one link are modeled, but it may also be useful to protect against node failures or multiple failures. Network routing is achieved incrementally in rounds. Each round comprises sending flow for each commodity down its shortest path (or pair of paths). These flows in turn affect the costs on the links, which in turn affect future choices of the shortest paths. After each round, a function is evaluated to check whether it is above a certain threshold. If so, the computation of an approximate solution has been completed.
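The round structure described above can be illustrated with a minimal Python sketch. It is not the disclosed method: it ignores failure modes and secondary flows, and the multiplicative cost update, the initial cost `delta`, the potential function, and the threshold of 1.0 are all assumptions chosen for illustration. The function name `route_in_rounds` is hypothetical.

```python
def route_in_rounds(capacity, paths, demand, eps=0.1, delta=1e-3,
                    max_rounds=100_000):
    """Illustrative round-based routing with multiplicative cost updates.

    capacity: dict link -> capacity u(e)
    paths:    dict commodity -> list of candidate paths (lists of links)
    demand:   dict commodity -> demand d_k
    Returns (average flow per (commodity, path index), rounds run).
    """
    # Assumed initialization: a small cost inversely proportional to capacity.
    cost = {e: delta / capacity[e] for e in capacity}
    sent = {(k, i): 0.0 for k in paths for i in range(len(paths[k]))}
    rounds = 0
    while rounds < max_rounds:
        rounds += 1
        for k, candidates in paths.items():
            # Pick the cheapest candidate path under the current link costs.
            i = min(range(len(candidates)),
                    key=lambda j: sum(cost[e] for e in candidates[j]))
            sent[(k, i)] += demand[k]
            # Routing flow raises the cost of every link on the chosen path.
            for e in candidates[i]:
                cost[e] *= 1 + eps * demand[k] / capacity[e]
        # Assumed stopping function: stop once the potential passes 1.0.
        potential = sum(cost[e] * capacity[e] for e in capacity)
        if potential >= 1.0:
            break
    # Averaging over rounds splits each demand across its candidate paths.
    flow = {key: total / rounds for key, total in sent.items()}
    return flow, rounds
```

With one commodity and two equal-capacity parallel paths, the rising cost of the path just used pushes the next round onto the other path, so the averaged flow splits roughly evenly.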

In a second aspect of the invention, the technique of the first aspect is performed subject to a budget constraint. In a third aspect of the invention, capacity is reserved when determining appropriate network routing. In fourth and fifth aspects of the invention, the techniques of the present invention are used with Multi-Protocol Label Switching (MPLS) and Multi-Protocol Wavelength Switching (MPλS), respectively.

A more complete understanding of the present invention, as well as further features and advantages of the present invention, will be obtained by reference to the following detailed description and drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates a block diagram of a system that is used to explain network routing and recovery;

FIG. 2 illustrates a block diagram of an exemplary system used to illustrate aspects of the present invention;

FIG. 3 is a block diagram of an exemplary network controller in accordance with a preferred embodiment of the invention; and

FIG. 4 is a flow chart of a method of determining network routing without a budget constraint, in accordance with a preferred embodiment of the invention.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

Broadly, techniques are presented for determining network routing. These techniques support dedicated reserve capacity, where reserved capacity on links in the network is dedicated for a particular commodity (generally, a source and sink pair of computers), and shared recovery, where reserved capacity on links is shared amongst two or more commodities. Basically, the techniques are given a set of demands as inputs. Each demand is a requested amount of capacity to be carried on the network. Demands are created between a source device and a destination device. The network lies between the source and destination devices and must be able to carry the demand, subject to certain criteria. One such criterion is generally some amount of extra demand so that communication between the source and destination can continue if a link failure or other failure occurs over the primary path chosen for the link. This extra demand will be routed over a secondary path. The outputs of the techniques include the determined primary and secondary demand for each link in the network.

The techniques of the present invention may be used in real time, which means that the primary and secondary demands determined as a network routing solution may be used to configure the network to actually route the determined demands. This is particularly true because the iterative methods of the present invention allow fairly fast convergence to a network routing solution, even for very large networks. Alternatively, the determined demands may be used to determine how or when to upgrade the network. For example, a network administrator may know the current demand but may desire to know what the required capacity or cost will be if the demand increases by a certain percentage. If the network administrator has data showing the approximate bandwidth increase per year, for example, for the network, the administrator can then determine when the capacity of the network will need to be increased based on the data.
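As a rough illustration of the upgrade-planning use, the following Python sketch estimates how many years pass before projected demand exceeds capacity. It assumes compound annual growth and a single aggregate capacity figure, neither of which is stated in this description. In practice the routing techniques would be rerun with the scaled demands. The function name `years_until_upgrade` is hypothetical.

```python
def years_until_upgrade(current_demand, capacity, annual_growth):
    """Return the first whole year in which projected demand exceeds capacity.

    Assumes demand grows by a fixed fraction `annual_growth` each year
    (e.g. 0.25 for 25 percent), which is a simplifying assumption.
    """
    if current_demand <= 0 or annual_growth <= 0:
        raise ValueError("demand and growth must be positive")
    years = 0
    demand = current_demand
    while demand <= capacity:
        demand *= 1 + annual_growth
        years += 1
    return years
```

For example, with a demand of 100 units, a capacity of 200 units, and 25 percent annual growth, the projected demand first exceeds capacity in year 4.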

Before proceeding with additional detail about the aspects of the present invention, it is helpful at this point to describe an exemplary system and possible routing problems in the system. Both packet and label switching networks have to route packets from a source to a destination. FIG. 1 helps to illustrate routing determination and possible problems. A system 100 is shown in FIG. 1. System 100 comprises two source devices 110, 120, two destination devices 160, 170, a network 170, and a network controller 180. Network 170 comprises a number of nodes 130-1, 130-2, 130-3, 140-1, 140-2, 140-3, 150-1, 150-2, 150-3, and 150-4. In this example, packets originate at source device 110, travel through link 115, are placed onto the network 170 by node 130-1, are removed from the network 170 by node 130-3, travel through link 165, and are received by destination device 160. Similarly, packets originate at source device 120, travel through link 125, are placed onto the network 170 by node 130-1, are removed from the network 170 by node 130-3, travel through link 175, and are received by destination device 170. For simplicity, paths taken by packets will be treated as one-way paths.

In system 100, there are three possible paths from which node 130-1 may select: (1) a path connecting nodes 140-1, 140-2, and 140-3 (collectively, “nodes 140”), having links 145-1, 145-2, 145-3, and 145-4 (collectively, “links 145”); (2) a path connecting nodes 130-1, 130-2, and 130-3 (collectively, “nodes 130”), having links 135-1 and 135-2 (collectively, “links 135”); and (3) a path connecting nodes 150-1, 150-2, 150-3, and 150-4 (collectively, “nodes 150”), having links 155-1, 155-2, 155-3, 155-4, and 155-5 (collectively, “links 155”). Each link has a certain capacity. Each node is generally a router or a device that performs the functions of a router.

Routing over a packet switching network 170 is normally relatively simple. When node 130-1 receives packets from link 115 or link 125, the node 130-1 determines how to route the packets based on the destinations, which are contained in the packets, and routing criteria, as discussed above. However, the routing criterion is usually solely shortest path, and most routers are oblivious to other criteria such as lowest cost and congestion. In the example of FIG. 1, it is assumed that the shortest path is through link 135-1, and packets from both source devices 110 and 120 are routed through this path. The node 130-2 will make a similar routing determination, again choosing the shortest path, which is through link 135-2. Consequently, routing tends to be a relatively simple affair, although more complex routing routines may be used, as discussed below.

Another important routing problem is providing backup paths in case a path through which data is currently being routed becomes unusable. For instance, packets from both source devices 110 and 120 might be routed through the path containing links 135. This path is called the primary path, and this path must, in this example, have enough capacity to support both source devices 110 and 120. For example, if source device 110 needs a capacity of 5 units and source device 120 needs a capacity of 7 units, then the path through links 135 must support at least a capacity of 12 units. The backup paths in this example are the paths through links 145 and links 155. These backup paths are called secondary paths, and capacity on these paths is reserved in case a failure occurs on the links 135 or nodes 130. In many systems, the capacity reserved on the secondary paths is at least the capacity on the primary paths. Therefore, in this example, 12 units of capacity must be reserved on the reserve paths having links 145 and 155. Assume that the reserved capacity on the path having links 145 is 5 units and the reserved capacity on the path having links 155 is 7 units. In this situation, the system 100 can respond to a failure of one of the links 135 or of node 130-2 by rerouting the packets from source device 110 through the path having links 145 and by rerouting the packets from source device 120 through the path having links 155. A system 100 or network 170 that provides reserve capacity for data is said to support recovery, and a system 100 or network 170 that reserves secondary capacity equivalent to the primary capacity is said to support 1-to-1 or 1+1 protection.

The latter type of protection is fairly simple to implement in a network 170 using light pathways as the transmission medium. For instance, node 130-1 can simply sense if there is light on the link 135-1 and route the light pathways from link 135-1 to link 145-1 (for source device 110) and link 155-1 (for source device 120) if there is no light on link 135-1. While this type of protection is relatively simple, it takes forethought to configure the network 170 to automatically transfer capacity during failure. Additionally, the network 170 is usually limited to the predetermined primary and secondary pathways, and other secondary pathways, for instance, cannot be chosen without manual intervention. Finally, 1+1 protection reserves too much capacity and will cause a network to have to add capacity sooner than what would be necessary if less capacity is reserved.

One way for less capacity to be reserved is to allow secondary paths to be shared. For instance, assume the links 135 only support 5 units of capacity, while links 155 only support 7 units of capacity. Source device 110, which needs 5 units of capacity, is routed by node 130-1 through link 135-1, and source device 120, which needs 7 units of capacity, is routed by node 130-1 through link 155-1. If it is assumed that only one failure of a path will occur at a time, such that either the path having links 135 or the path having links 155 will fail but not both, then the reserved capacity can be 7 units of capacity. This allows the reserved path having links 145 to have 7 units of capacity reserved on the path, instead of the 12 units needed if both the path having links 135 and the path having links 155 fail at the same time. This recovery system is a shared recovery system.
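The arithmetic in this example can be sketched in Python as follows. The sketch assumes that all commodities fall back to the same backup path and that at most one primary path fails at a time. The function name `reserve_needed` and the dictionary layout are hypothetical.

```python
def reserve_needed(demands, primary_path, shared):
    """Capacity to reserve on a single common backup path.

    demands:      dict commodity -> demand in capacity units
    primary_path: dict commodity -> name of its primary path
    shared:       True for shared recovery, False for dedicated reserve
    """
    if not shared:
        # Dedicated reserve: every commodity keeps its own backup capacity.
        return sum(demands.values())
    # Shared recovery: only one primary path fails at a time, so the backup
    # must cover the largest total demand carried by any one primary path.
    per_path = {}
    for commodity, units in demands.items():
        path = primary_path[commodity]
        per_path[path] = per_path.get(path, 0) + units
    return max(per_path.values(), default=0)
```

With `demands = {"110": 5, "120": 7}` and primaries on "links 135" and "links 155" respectively, the function returns 12 for dedicated reserve and 7 for shared recovery, matching the figures above.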

A problem with shared recovery systems is that the analysis of a network can be quite complex. There are mathematical solutions that will determine an optimum routing for both primary and secondary paths for a number of sources and destinations. However, these mathematical solutions tend to take quite a while to solve and they tend to fail when the number of nodes, sources, and destinations exceeds a certain amount.

Moreover, as discussed above, many networks remain relatively fixed in their ability to have nodes switch paths. Nonetheless, new routers are becoming more prevalent, and the new routers support certain protocols that allow a remote device, such as network controller 180, to more easily configure routers to automatically switch to secondary paths when primary paths fail. Two important protocols that may be used to perform configuration functions are Multi-Protocol Label Switching (MPLS) and Multi-Protocol Wavelength Switching (MPλS). The latter is basically a specialized version of the former.

A network that supports MPLS is one type of label switching network. A label switching network runs in conjunction with a packet switching network, and uses labels placed into packets in order to route packets. The exact technique used to place the label into a packet depends on the underlying packet switching protocol being used. For example, the location for a label placed into a packet for a network using the Internet Protocol (IP) may be somewhat different than the location for a label for a packet for a network using Asynchronous Transfer Mode (ATM).

Regardless of the actual technique used to support labels in packets, normal routing procedures for packet switching are disabled in a label switching network. A “tunnel” is created to connect a source with a destination. The appropriate labels are selected for packets emanating from the source and entering the network so that the packets follow the tunnel and exit the network at the destination.

Even with these new protocols, the problem of determining how to efficiently and quickly route all of the requested capacity in a network while still providing shared recovery for large networks is still quite important. A need therefore exists for techniques that solve this problem.

Aspects of the present invention solve this problem by providing techniques that quickly converge to a network routing solution that is within a predetermined error from an optimal network routing solution.

The present invention is generally applicable to any technology or protocol that is path-oriented (as opposed to being link oriented) and that can perform explicit recovery, which means that the technology can transfer flow from one path to another. For example, Multi-Protocol Label Switching (MPLS) and Asynchronous Transfer Mode (ATM) are suitable technologies for implementing aspects of the present invention. The present invention is less applicable to Internet Protocol (IP), because IP is link (and not path) based. It should be noted, however, that MPLS and Multi-Protocol Wavelength Switching (MPλS) can be implemented over IP.

It should also be noted that the techniques of the present invention are scalable. Many techniques for determining network routing fail when the number of nodes passes a certain value. It can be shown that the present invention can be used with a very large number of nodes.

Broadly, the setting of shared protection is as follows. A network G=(V,E) is considered with a capacity u(e) and cost c(e) for each link e. The network G is modeled as an undirected graph. In graph theory terms, G is usually called a graph and a link e is called an edge. There is also a set, K, of commodities (or “demand-pairs”), each commodity k∈K specified by a source-sink pair (s_(k), t_(k)), demand d_(k), and a collection Λ_(k) of pairwise link-disjoint paths, each of which connects s_(k) to t_(k). That is, no two paths in Λ_(k) have any common link. Any single link of G may fail at any time, thus rendering all paths that pass through it temporarily useless, until the link is made live again. All flows, however, for commodity k must be routed on one of the paths in Λ_(k).
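A commodity and its collection of pairwise link-disjoint paths might be represented as in the following Python sketch. The class name `Commodity`, the helper `link`, and the choice to store paths as lists of node names are assumptions made for illustration. Links are stored as unordered pairs because the graph is undirected.

```python
from dataclasses import dataclass
from itertools import combinations


def link(u, v):
    """An undirected link between nodes u and v."""
    return frozenset((u, v))


@dataclass
class Commodity:
    source: str
    sink: str
    demand: float
    paths: list  # each path is a list of node names from source to sink

    def path_links(self, path):
        """Set of undirected links traversed by a path."""
        return {link(a, b) for a, b in zip(path, path[1:])}

    def is_valid(self):
        """Check that every path connects source to sink and that no two
        paths share a link."""
        for p in self.paths:
            if p[0] != self.source or p[-1] != self.sink:
                return False
        return all(
            self.path_links(p).isdisjoint(self.path_links(q))
            for p, q in combinations(self.paths, 2)
        )
```

Two paths may pass through a common node and still be link-disjoint in this check, since only shared links are rejected.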

One objective of the present invention is to design working and restoration capacity so that under certain failure states, there is still enough capacity to meet demand. Moreover, the long term operational cost of these traffic flows is to be minimized. To do this, statistical information is assumed on the relative likelihoods of the various links failing and, as a result, of the paths in Λ:=∪_(k∈K)Λ_(k) failing (a path fails when one of its links fails). This information is updated as further failures occur, with old information discounted in order to have an accurate picture of the existing network. For each commodity k specified by its source-sink pair, (s_(k),t_(k)), what must be chosen is (a) appropriate primary flows on paths in Λ_(k), and (b) how much flow is transferred from one path to another in the event of a failure. This important requirement that there be sufficient flow even under any link failure is formalized below. Furthermore, under the normal state of no failure, as well as under the failure of any single link, the capacity of any link, e, should not be exceeded: the total demand using it should be at most u(e). An objective is to design the paths, flows, and flow transfer strategies under failure, in order to minimize the long term average cost of operating the network. This average cost is just the expected long term cost of operating the network under a given collection of primary and backup paths, where the random variables are the failures of the various paths.

Referring now to FIG. 2, an exemplary system 200 is shown. System 200 comprises k sources 210-1 through 210-k (collectively, “sources 210”), k destinations 240-1 through 240-k (collectively, “destinations 240”), network 270, and a network controller 260. Network 270 comprises a plurality of nodes, of which nodes 220-1 through 220-4 (collectively, “nodes 220”), nodes 230-1 through 230-3 (collectively, “nodes 230”), and nodes 250-1 through 250-3 (collectively, “nodes 250”) are shown. Although only nodes 220, 230, 250 are shown, generally a network 270 using techniques of the present invention will contain many such nodes. The nodes that are shown are merely exemplary. The nodes are interconnected through links (also called edges). For instance, node 220-1 is connected to node 220-2 through link 221-1. Each of the links 221-1 through 221-3 (collectively, “links 221”), link 232-1, link 232-2, links 251-1 and 251-2 (collectively, “links 251”), and links 231-1 and 231-2 (collectively, “links 231”) has a capacity u(e) and cost c(e).

Each commodity k, which is specified by a source-sink pair (s_(k),t_(k)) such as source-sink pair (210-1, 240-1), has a demand d_(k). In the example of FIG. 2, the source-sink pair (210-1, 240-1) is routed over a path comprising nodes 220 and links 221. The source-sink pair (210-1, 240-1) has demand d₁. Similarly, source-sink pair (210-k, 240-k) is routed over a path comprising nodes 230 and links 231. Capacity is reserved on links 222-1, 250, and 222-2 in order to route the demand d₁ if one of the links 221 fails (or, equivalently, if a node 220-2 or 220-3 fails). Similarly, capacity is reserved on links 232-1, 250, and 232-2 in order to route the demand d_(k) if one of the links 231 fails (or, equivalently, if a node 220-2 or 220-3 fails). This example is a shared recovery network, although the techniques of the present invention may also be used to determine network routing for dedicated reserve capacity, as discussed below.

In order to route the demand, network controller 260 uses a number of methods to determine how much demand should be routed on each link (called the primary capacity) and how much capacity should be reserved on each link in case a link on which demand is being routed fails (called secondary or reserve capacity). The network controller 260 then acts to configure nodes 220, 230, and 250 of the network 270 in order to bring about the determined primary and secondary capacities.

In a preferred embodiment, all or a portion of network 270 employs label switching. This label switching is generally controlled through Multi-Protocol Label Switching (MPLS), which can include Multi-Protocol Wavelength Switching (MPλS), and its control protocols. A good introduction to MPLS and MPλS is given in Jerram and Farrel, “MPLS in Optical Networks: An Analysis of the Features of MPLS and Generalized MPLS and Their Application to Optical Networks, with Reference to the Link Management Protocol and Optical UNI,” white paper from Data Connection, 100 Church St., Enfield, UK, (October, 2001), the disclosure of which is hereby incorporated by reference.

As discussed briefly above, MPLS is a new and promising routing scheme that aims to provide end-to-end quality of service (QoS) in otherwise best-effort packet networks. The attraction of MPLS is due to its combination of high-performance, efficient layer-2 switching with the intelligence of layer-3 Internet Protocol (IP) control. By creating and pinning down tunnels, or Label-Switched (explicit) Paths (LSPs) with well-defined bandwidths, MPLS makes it possible to carry QoS-sensitive applications such as real-time audio and video in an IP network. See Rosen et al., “Multiprotocol Label Switching Architecture,” Internet Engineering Task Force Request for Comments 3031 (IETF RFC 3031) (January 2001), the disclosure of which is hereby incorporated by reference. By identifying a wavelength with a label, this protocol has been extended to provide a complete IP-based mechanism to set up, manage, reconfigure and tear down wavelength tunnels in the optical core of networks. MPλS, as this protocol is known, makes it possible to add intelligence and reconfigurability to the traditionally static optical core of networks. See Ashwood-Smith et al., “Generalized MPLS—Signaling Functional Description,” IETF draft (November 2001) and IEEE Communications Magazine (December 1999), the disclosures of which are hereby incorporated by reference. Intelligent routing and reconfigurability are the key requirements of the emergent optical Internet.

To provide these desired capabilities in an optical network, traffic loads between distinct source-destination pairs are measured and abstracted, the emerging set of demands is aggregated and groomed, and source-destination loads with bit-rates at or close to that of a wavelength are created. These are the set of loads that need to be provisioned over an existing network, with both primary and backup wavelength-switched paths (also known as λ-Switched Paths or λSP tunnels) for protection. Since only a small number of backup tunnels are typically invoked at a time, there is considerable room for their capacity sharing. For instance, if a network is to protect against single-link failures and if primary tunnels P₁ and P₂ do not share any links, then their respective backup paths can share λs (since P₁ and P₂ should not fail simultaneously). Efficient management of backup paths such as MPλS tunnels in a network can result in up to a fifty percent increase in the throughput of a network. For the latter, see Davis et al., “SPIDER: A Simple and Flexible Tool for Design and Provisioning of Protected Light-Paths in Optical Networks,” Bell Labs Technical Journal, Vol. 6, No. 1 (January 2001), the disclosure of which is hereby incorporated by reference.

Referring now to FIG. 3, an exemplary network controller 160 is shown interacting with network 270 and Digital Versatile Disk (DVD) 370. Network controller 160 comprises a processor 310, a memory 320, a media interface 330, and a network interface 340. Memory 320 comprises a network routing determination process 345, routing information 350, and a network configuration process 360. Network interface 340 communicates with network 270 and with processor 310 and memory 320. Note that many networks will contain other layers of software and hardware that are not shown but are known to those skilled in the art. For example, the Internet Protocol (IP) contains multiple layers, some of which are used to access hardware and others are strictly software. Media interface 330 is used to allow processor 310 to access DVD 370 or other types of media, such as hard drives or removable drives or optical storage means.

Network routing determination process 345 uses demands for commodities and determines appropriate routing for links. Network routing determination process 345 uses an ε-approximation technique, which means that the optimal network routing is not determined. Instead, an approximation to the optimal network routing is used, and this approximation is within ε of the optimal network routing. By making ε very small, the techniques of the present invention may be used to come arbitrarily close to the optimal network routing such that, in real-world terms, the ε-based and optimal network routings are indistinguishable. However, with decreases in ε, there are corresponding increases in processing time. Technologies such as the Internet and IP can experience rapid changes in demand, and it is beneficial to use higher values of ε in order for the methods of the present invention to quickly converge to a solution. In addition, one may compute, ahead of time, an upper bound on the number of iterations required to obtain the desired accuracy. This gives an estimate of how long to expect the algorithm to run. Moreover, current Internet routing techniques are poor in this regard, and are often 30-40 percent away from the optimum network routing for a particular set of commodities and demands.

The routing information determined from network routing determination process 345 is stored in routing information 350, and this information will usually comprise a list of links, how much demand should be routed over each link, and how much demand should be reserved on each link. The reserved capacity on each link can be shared capacity, dedicated capacity, or a combination of both of these. Routing information 350 may also contain additional data, such as cost for each link.
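One possible in-memory layout for such routing information is sketched below in Python. The class name `LinkRouting`, its field names, and the `report` helper are hypothetical. The check that primary plus reserved demand fits within the link capacity is a conservative assumption, not a rule stated in this description.

```python
from dataclasses import dataclass


@dataclass
class LinkRouting:
    link: str          # link identifier, e.g. "221-1"
    primary: float     # demand routed over the link in the normal state
    reserved: float    # demand reserved on the link for recovery
    capacity: float    # link capacity u(e)
    cost: float = 0.0  # optional link cost c(e)

    def fits(self):
        """Conservative check: primary plus reserve within capacity."""
        return self.primary + self.reserved <= self.capacity


def report(entries):
    """Format routing information as a plain-text table for an administrator."""
    lines = [f"{'link':<8}{'primary':>9}{'reserved':>10}{'capacity':>10}"]
    for r in entries:
        flag = "" if r.fits() else "  OVER"
        lines.append(
            f"{r.link:<8}{r.primary:>9.1f}{r.reserved:>10.1f}"
            f"{r.capacity:>10.1f}{flag}"
        )
    return "\n".join(lines)
```

Calling `report` on a list of `LinkRouting` entries yields one row per link, with rows marked `OVER` where the conservative check fails.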

Network configuration process 360 uses the routing information 350 to configure network 270 to support the primary and secondary flows of the routing information 350. As is known in the art, routers generally contain routing tables that are used to route incoming packets to outgoing ports. These tables can be modified to set up the network 270 in accordance with the routing information 350. Similar tables are used for MPLS and MPλS, and these tables may be changed through certain control protocols. See, for instance, the Link Management Protocol section of Jerram, which has been incorporated by reference above. Additionally, the routing information 350 may be simply output to a screen or file in a manner suitable for a network administrator to view and determine network routing.

As described in detail below, network routing determination process 345 can contain budgeting criteria or have no budgeting criteria. Network routing determination process 345 can employ shared recovery, where no reserve capacity is dedicated, or employ dedicated reserve capacity for each commodity. In FIG. 4, a method is disclosed for determining network routing without a budget constraint, and, in FIG. 5, a method is disclosed for determining network routing with a budget constraint. Other methods are discussed below.

As is known in the art, the methods and apparatus discussed herein may be distributed as an article of manufacture that itself comprises a computer system-readable medium having computer system-readable code means embodied thereon. The computer system-readable code means is operable, in conjunction with a device such as network controller 160, to carry out all or some of the steps to perform the methods or create the apparatuses discussed herein. The computer system-readable medium may be a recordable medium (e.g., floppy disks, hard drives, memory cards, or optical disks, such as DVD 370, which is accessed through media interface 330) or may be a transmission medium (e.g., a network comprising fiber-optics, the world-wide web, cables, or a wireless channel using time-division multiple access, code-division multiple access, or other radio-frequency channel). Any medium known or developed that can store information suitable for use with a computer system may be used. The computer system-readable code means is any mechanism for allowing a computer system to read instructions and data, such as magnetic variations on a magnetic medium or height variations on the surface of a compact disk, such as DVD 370.

Memory 320 configures the processor 310 to implement the methods, steps, and functions disclosed herein. The memory 320 could be distributed or local and the processor 310 could be distributed or singular. The memory 320 could be implemented as an electrical, magnetic or optical memory, or any combination of these or other types of storage devices. Moreover, the term “memory” should be construed broadly enough to encompass any information able to be read from or written to an address in the addressable space accessed by processor 310. With this definition, information on a network (e.g., wired network 270 or a wireless network) is still within a memory, such as memory 320, because the processor, such as processor 310, can retrieve the information from the network. It should be noted that each distributed processor that makes up processor 310 generally contains its own addressable memory space. It should also be noted that some or all of network controller 160 can be incorporated into an application-specific or general-use integrated circuit. For example, the network routing determination process 345 may be made into an application-specific integrated circuit.

The problem of determining flow in a graph, such as system 200 of FIG. 2, is now modeled more formally. A graph, G=(V,E), is considered with edge capacities u(e), costs c(e); a set of commodities K, each commodity k specified by a source-sink pair (s_(k),t_(k)) with demand d_(k); and a set of network states Q. Each state corresponds to a potential set of network element failures. It is assumed that there is a state, q₀∈Q, that is the normal state when no failures have occurred. With each commodity, k, is a corresponding set of disjoint paths Λ_(k) from s_(k) to t_(k). By “disjoint,” it is meant that no two paths in Λ_(k) have any common link. What is sought is enough reserved capacity to route all demands d_(k) on paths in Λ_(k), while respecting edge capacities so that, if some network failure occurs affecting a subset of paths in Λ:=∪_(kεK)Λ_(k), there is enough reserve capacity to reroute any disrupted flow. The cost of a solution is the sum over all links eεE of their costs, c(e), times the expected amount of flow on that edge e (over time).

First, the best fractional solution possible is determined. To do this, the problem is formulated as a linear program. Variable x(P) denotes the amount of flow on path P. Variable y^(P′)(P) denotes the amount of flow that is rerouted from path P′ to path P when a link on path P′ fails. Variable y^(P′)(P) is defined if and only if the following are true: (a) P≠P′ and (b) P and P′ are paths for the same commodity. When failure q occurs, the set of affected paths is denoted by Λ(q) (so Λ(q₀)=Ø). Let Q_(k)⊂Q denote the set of all failures that affect some path in Λ_(k). The formulation used herein assumes that |Λ_(k)∩Λ(q)|≦1 for all k, q. That is, it is assumed that a failure does not affect more than one path in Λ_(k).

For each path P∈Λ, let κ(P) denote the steady-state proportion of time for which P is in non-failed mode. As mentioned above, these probabilities κ(P) are continuously updated and learned. To model the objective function, assign, for each commodity k and paths P₁,P₂∈Λ_(k), a cost c(P₁,P₂) as follows: c(P,P)=κ(P)Σ_(e∈P)c(e), and if P₁≠P₂, c(P₁,P₂)=(1−κ(P₁))Σ_(e∈P₂)c(e). Thus, c(P,P) is the (long term) expected cost of primary flow on path P, and c(P₁,P₂), for P₁≠P₂, is the (long term) expected cost of backup flow on P₂ from P₁. Henceforth, set c(P):=c(P,P). So, by the linearity of expectation, the objective function of long term average cost is:

$\begin{matrix} {{{EC}\left( {x,y} \right)} = {\sum\limits_{k}{\sum\limits_{P \in \Lambda_{k}}{\left\lbrack {{{c(P)}{x(P)}} + {\sum\limits_{P_{1} \in {\Lambda_{k}\backslash P}}{{c\left( {P_{1},P} \right)}{y^{P_{1}}(P)}}}} \right\rbrack.}}}} & (1) \end{matrix}$
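As an illustration of the cost definitions and of Eq. (1), the following minimal sketch computes c(P₁,P₂) and EC(x,y) for a single commodity. The function names, the dictionary layout, and the sample numbers are assumptions made for this example only and are not part of the disclosed method.

```python
def path_costs(paths, edge_cost, kappa):
    """Return c[(P1, P2)] for one commodity.

    paths:     dict path name -> list of links on that path
    edge_cost: dict link -> c(e)
    kappa:     dict path name -> proportion of time the path is non-failed
    """
    raw = {p: sum(edge_cost[e] for e in links) for p, links in paths.items()}
    c = {}
    for p1 in paths:
        for p2 in paths:
            if p1 == p2:
                # expected cost of primary flow on P
                c[(p1, p2)] = kappa[p1] * raw[p1]
            else:
                # expected cost of backup flow on P2 when P1 has failed
                c[(p1, p2)] = (1.0 - kappa[p1]) * raw[p2]
    return c


def expected_cost(c, x, y):
    """EC(x, y) of Eq. (1) for one commodity.

    x: dict path -> primary flow
    y: dict (failed path P1, backup path P) -> rerouted flow
    """
    total = sum(c[(p, p)] * flow for p, flow in x.items())
    total += sum(c[(p1, p)] * flow for (p1, p), flow in y.items())
    return total


# Hypothetical two-path commodity.
paths = {"A": ["a1", "a2"], "B": ["b1"]}
edge_cost = {"a1": 1.0, "a2": 2.0, "b1": 4.0}
kappa = {"A": 0.9, "B": 0.8}
c = path_costs(paths, edge_cost, kappa)
ec = expected_cost(c, {"A": 10.0}, {("A", "B"): 10.0})
```

Here ten units are routed on path A with backup on path B, so the expected cost is 10·c(A,A) + 10·c(A,B).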

The problem, called Minimize Operational Cost (MOC), may now be formulated as the following Linear Program (LP):

-   -   min EC(x,y)         subject to the following:

$\sum\limits_{P \in \Lambda_{k}} x(P) \geq d_{k} \quad \forall k \in K$

$\sum\limits_{P \in \Lambda_{k} \backslash P'} \left[ x(P) + y^{P'}(P) \right] \geq d_{k} \quad \forall P' \in \Lambda_{k},\ \forall k \in K$

$\sum\limits_{k \in K} \sum\limits_{P \in \Lambda_{k} \backslash \Lambda(q):\, e \in P} \left[ x(P) + \sum\limits_{P' \in \Lambda_{k} \cap \Lambda(q)} y^{P'}(P) \right] \leq u(e) \quad \forall e \in E,\ \forall q \in Q$

$x, y \geq 0$

The first constraint above says that under no-failure conditions, the total demand d_(k) should be met for each commodity k. The second constraint says that this demand-satisfaction should hold even if any path P′εΛ_(k) fails. The third constraint is that under any network state q (including the no-failure state q₀), the total flow on any link e should be at most its capacity. The final trivial constraints are also shown, which force all variables to be non-negative.
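The three constraint families can be checked mechanically for a candidate solution. The sketch below does this for a single commodity; the function name `is_feasible`, the data layout, and the tolerance are illustrative assumptions rather than part of the disclosure.

```python
def is_feasible(x, y, demand, paths, cap, affected, tol=1e-9):
    """Check the (MOC) constraints for one commodity.

    x:        dict path -> primary flow
    y:        dict (failed path, surviving path) -> rerouted flow
    paths:    dict path name -> list of links
    cap:      dict link -> u(e)
    affected: dict state -> set of path names that fail in that state
    """
    names = list(paths)
    # Constraint 1: demand is met when nothing has failed.
    if sum(x.get(p, 0.0) for p in names) < demand - tol:
        return False
    # Constraint 2: demand is still met if any single path P' fails.
    for failed in names:
        surviving = sum(x.get(p, 0.0) + y.get((failed, p), 0.0)
                        for p in names if p != failed)
        if surviving < demand - tol:
            return False
    # Constraint 3: link capacities hold in every network state q.
    for failed_set in affected.values():
        load = {e: 0.0 for e in cap}
        for p in names:
            if p in failed_set:
                continue
            flow = x.get(p, 0.0) + sum(y.get((f, p), 0.0) for f in failed_set)
            for e in paths[p]:
                load[e] += flow
        if any(load[e] > cap[e] + tol for e in cap):
            return False
    return True


paths = {"A": ["a"], "B": ["b"]}
affected = {"q0": set(), "qa": {"A"}, "qb": {"B"}}
ok = is_feasible({"A": 5.0}, {("A", "B"): 5.0}, 5.0, paths,
                 {"a": 10.0, "b": 10.0}, affected)
```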

This linear program (MOC) will be very large for moderate sized networks and a moderate number of failure scenarios. This linear program may be used to model how to route traffic in the Internet. Since the Internet changes frequently, this problem needs to be solved quickly. Exact linear programming codes, which will determine an exact solution to the problem, are too slow. Since the networks normally dealt with typically have reserve capacity, approximate solutions are suitable. Aspects of the present invention adapt approximate LP techniques to obtain a provably good solution to this linear program. For multicommodity flow problems, ε-approximation methods have proven to be computationally effective in practice. Thus, these methods can be adapted to solve this problem.

1. Maximum Concurrent Recovery Problem

To start, it is determined if there is a feasible solution. Thus, the cost objective function is exchanged with the objective to maximize the minimum fraction of each demand that is met. This involves modifying the demand constraints as well. A new variable λ is introduced that represents the minimum fraction of demand routed, over all commodities, and the variables ƒ(e) are replaced with the given capacities u(e).

Since it is assumed that |Λ_(k)∩Λ(q)|≦1 for all k, q, the second set of inequalities in (MOC) can be simplified so that there is precisely one inequality for every P′∈Λ. For clarity, this will be written as ∀P′∈Λ_(k), ∀k∈K. The summation in the third set of inequalities in (MOC) is also broken down by commodities. This linear program is then as follows (note that, apart from the maximization of λ, the formulation is copied from (MOC) above).

-   -   max λ         subject to the following:

$\sum\limits_{P \in \Lambda_{k}} x(P) \geq \lambda\, d_{k} \quad \forall k \in K$

$\sum\limits_{P \in \Lambda_{k} \backslash P'} \left[ x(P) + y^{P'}(P) \right] \geq \lambda\, d_{k} \quad \forall P' \in \Lambda_{k},\ \forall k \in K$

$\sum\limits_{k \in K} \sum\limits_{P \in \Lambda_{k} \backslash \Lambda(q):\, e \in P} \left[ x(P) + \sum\limits_{P' \in \Lambda_{k} \cap \Lambda(q)} y^{P'}(P) \right] \leq u(e) \quad \forall e \in E,\ \forall q \in Q$

$x, y, \lambda \geq 0.$

The dual of this linear program is

$\min{\sum\limits_{e \in E}{{u(e)}{\sum\limits_{q \in Q}{h^{q}(e)}}}}$ ${{\sum\limits_{q:{P^{\prime} \in {\Lambda{(q)}}}}{\sum\limits_{e \in P}{h^{q}(e)}}} \geq w^{P^{\prime}}},{\forall{k \in K}},{\forall{P^{\prime} \in \Lambda_{k}}},{P \in {\Lambda_{k} - \left\{ P^{\prime} \right\}}}$ ${{\sum\limits_{q:{P \notin {\Lambda{(q)}}}}{\sum\limits_{e \in P}{h^{q}(e)}}} \geq {z_{k} + {\sum\limits_{P^{\prime} \in {\Lambda_{k} - {\{ P\}}}}w^{P^{\prime}}}}},{\forall{P \in \Lambda_{k}}},{\forall{k \in K}}$ ${{\sum\limits_{k \in K}{d_{k}z_{k}}} + {\sum\limits_{k \in K}{\sum\limits_{P^{\prime} \in \Lambda_{k}}{d_{k}w^{P^{\prime}}}}}} \geq 1$ h, w, z ≥ 0.

It is beneficial to define new dual variables, Z_(k)=z_(k)+Σ_(P′∈Λ_(k))w^(P′). Intuitively, the dual variable h^(q)(e) represents the marginal cost of link e when the network is in state q. Following this intuition, w^(P′) is the cost of the cheapest recovery path that could be taken if path P′ were to fail. This means Z_(k) is the cost of the cheapest primary path for commodity k, including in this the cost of the appropriate backup path in case of failure. The dual linear program may then be rewritten as follows:

$\min \sum\limits_{e \in E} u(e) \sum\limits_{q \in Q} h^{q}(e)$

$\sum\limits_{q: P' \in \Lambda(q)} \sum\limits_{e \in P} h^{q}(e) \geq w^{P'} \quad \forall k \in K,\ \forall P \neq P',\ P' \in \Lambda_{k},\ P \in \Lambda_{k}$

$w^{P} + \sum\limits_{q: P \notin \Lambda(q)} \sum\limits_{e \in P} h^{q}(e) \geq Z_{k} \quad \forall P \in \Lambda_{k},\ \forall k \in K$

$\sum\limits_{k \in K} d_{k} Z_{k} \geq 1$

$h, w, z \geq 0$

1.1 An Exemplary Method for Determining an Approximate Solution

A preferred method for determining an approximate solution to network routing will be discussed in the following manner. First, the method will be described in succinct terms and through a figure. Next, a more elaborate pseudocode version of the method will be presented. Finally, analyses and proofs will be presented.

Referring now to FIG. 4, a method 400 is shown for determining an approximate solution to network routing. Method 400 is performed by a network controller, although other computer systems may also perform the method. First, the method 400 starts with setting a dual solution h^(q)(e)=δ/u(e) (step 410), for an appropriately small δ, which will be given later as a function of ε, m, and |Q|, and a primal solution x=y=0 (step 415). Given any vector h, one can obtain w by setting w^(P′) equal to the minimum over all possible backup paths P for P′ of Σ_(q:P′∈Λ(q))Σ_(e∈P)h^(q)(e). Define h^(q)(P) as h^(q)(P):=Σ_(e∈P)h^(q)(e). Then define Z_(k) as follows.

$\begin{matrix} \begin{matrix} {Z_{k} \equiv \min\limits_{P_{1} \in \Lambda_{k}} \left[ \sum\limits_{q: P_{1} \notin \Lambda(q)} h^{q}(P_{1}) + \min\limits_{P_{2} \in \Lambda_{k} \backslash P_{1}} \sum\limits_{q: P_{1} \in \Lambda(q)} h^{q}(P_{2}) \right]} \\ {= \min\limits_{P_{1}, P_{2} \in \Lambda_{k}: P_{1} \neq P_{2}} \left[ \sum\limits_{q: P_{1} \notin \Lambda(q)} h^{q}(P_{1}) + \sum\limits_{q: P_{1} \in \Lambda(q)} h^{q}(P_{2}) \right]} \end{matrix} & (2) \end{matrix}$

Note that what has been ensured is that all the dual constraints, except the one corresponding to λ, are satisfied. To satisfy the λ constraint, simply divide all dual variables by α:=Σ_(k∈K)Z_(k)d_(k) to obtain a feasible solution that satisfies Σ_(k)d_(k)Z_(k)=1.
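A brute-force evaluation of Eq. (2) may help make the definition concrete. The sketch below enumerates ordered pairs of distinct paths for one commodity; the name `cheapest_pair` and the data layout are assumptions for this example, and a practical implementation would likely use a faster search.

```python
from itertools import permutations


def path_length(h_q, links):
    """h^q(P): sum of the state-q lengths over the links of a path."""
    return sum(h_q[e] for e in links)


def cheapest_pair(h, paths, affected):
    """Evaluate Eq. (2) by enumeration.

    h:        dict state -> dict link -> h^q(e)
    paths:    dict path name -> list of links (the set Lambda_k)
    affected: dict state -> set of path names that fail in that state
    Returns (Z_k, P1, P2).
    """
    best = None
    for p1, p2 in permutations(paths, 2):
        value = 0.0
        for q, h_q in h.items():
            if p1 not in affected[q]:
                value += path_length(h_q, paths[p1])  # primary survives in q
            else:
                value += path_length(h_q, paths[p2])  # backup carries the flow
        if best is None or value < best[0]:
            best = (value, p1, p2)
    return best


paths = {"A": ["a"], "B": ["b"]}
affected = {"q0": set(), "qa": {"A"}, "qb": {"B"}}
h = {"q0": {"a": 1.0, "b": 5.0},
     "qa": {"a": 1.0, "b": 2.0},
     "qb": {"a": 1.0, "b": 1.0}}
z_k, p1, p2 = cheapest_pair(h, paths, affected)
```

With these hypothetical lengths, routing primary flow on A with backup on B costs 1 + 2 + 1 = 4, while the reverse assignment costs 5 + 2 + 1 = 8.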

This exemplary method cycles through the commodities. For each visit to commodity k (step 420), the method will push d_(k) flow along paths in Λ_(k). This pushing is preferably performed on a “cheapest” pair of paths P₁,P₂, which is a pair for which the cost of the primary flow on P₁ plus the restoration flow on P₂ is minimized (step 425). The method then pushes along P₁ an amount u determined as the minimum of three quantities. Let u(P) denote the bottleneck capacity of P: u(P):=min_(e∈P)u(e). Then u is set equal to min{u(P₁),u(P₂),d_(k)}. The method continues by pushing u along path P₂ and updates the primal and dual variables x,y,h as described in the pseudocode below. In particular, costs are updated (step 430) for each failure mode. The method 400 repeats this process until d_(k) units of flow have been pushed along primary paths, then proceeds to the next commodity (step 440=YES). In other words, if the amount being pushed through a primary path is less than the demand needed by the commodity (step 435=YES), then demand is again routed (step 420) and costs are updated (step 430). Otherwise (step 435=NO), another commodity is selected (step 440). In practice, it is beneficial to start the method by scaling all capacities so that max_(k)d_(k)≦min_(e)u(e). Thus, u is in effect always equal to d_(k), and the method sends flow exactly once per commodity per iteration.

The method stops when an ε-approximate solution is found. An ε-approximate dual solution is used to verify the ε-approximate primal solution. It will be shown below that a sufficient criterion for this is that the value of the dual objective function is at least 1. The dual objective function for this example is defined above as

${\sum\limits_{e \in E}{{u(e)}{\sum\limits_{q \in Q}{h^{q}(e)}}}},$ which is being minimized in the linear program. The method ends when the value of the dual objective function is at least one (step 445=NO), else another commodity is selected (step 445=YES). Given h^(q) for each q, define D(h) as the value of the corresponding dual objective function. For an integer i, define D(i) to be the dual objective function value at the beginning of the i^(th) iteration; whenever the term “iteration” is used below, the outer “while” loop in the method is being described. Method 400 in pseudocode appears below.

Initialize h^(q)(e) = δ/u(e) ∀e ∈ E, ∀q ∈ Q
Initialize x ≡ 0, y ≡ 0
while D(h) < 1
    for k = 1 to |K| do
        d′ ← d_(k)
        while D(h) < 1 and d′ > 0
            P₁, P₂ ← shortest disjoint path pair in Λ_(k) that minimizes Eq. (2)
            u ← min{d′, u(P₁), u(P₂)}
            x(P₁) ← x(P₁) + u
            y^(P₁)(P₂) ← y^(P₁)(P₂) + u
            d′ ← d′ − u
            for q ∈ Q
                if P₁ ∉ Λ(q) do
                    ∀e ∈ P₁, h^(q)(e) ← h^(q)(e)·e^(εu/u(e))
                else do
                    ∀e ∈ P₂, h^(q)(e) ← h^(q)(e)·e^(εu/u(e))
            end for
        end while
    end for
end while

The procedure stops at the first iteration t for which D(t+1)≧1. In the first t−1 iterations, for every commodity k, there has been routed (t−1)d_(k) units of flow along primary paths and backup paths. This flow may violate capacity constraints. By scaling appropriately, the flow can be made feasible.
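The pseudocode can be transcribed almost line for line into a runnable form. The following is a minimal sketch, assuming small explicit path sets, path names that are unique across commodities, and brute-force enumeration of path pairs; the function names, argument layout, and sample instance are illustrative assumptions, not the disclosed implementation.

```python
import math
from itertools import permutations


def dual_value(h, cap):
    """D(h) = sum over links of u(e) times the sum over states of h^q(e)."""
    return sum(cap[e] * h_q[e] for h_q in h.values() for e in cap)


def pair_length(h, affected, paths, p1, p2):
    """Bracketed expression of Eq. (2) for the ordered pair (p1, p2)."""
    total = 0.0
    for q, h_q in h.items():
        used = paths[p2] if p1 in affected[q] else paths[p1]
        total += sum(h_q[e] for e in used)
    return total


def concurrent_recovery(cap, commodities, affected, eps, delta):
    """cap: link -> u(e); commodities: list of (d_k, {path name: [links]});
    affected: state -> set of failed path names.
    Returns unscaled flows x, y and the number of outer iterations."""
    h = {q: {e: delta / cap[e] for e in cap} for q in affected}
    x, y = {}, {}
    iterations = 0
    while dual_value(h, cap) < 1.0:
        iterations += 1
        for demand, paths in commodities:
            remaining = demand
            while dual_value(h, cap) < 1.0 and remaining > 0:
                # cheapest primary/backup pair under the current lengths
                p1, p2 = min(permutations(paths, 2),
                             key=lambda pr: pair_length(h, affected, paths, *pr))
                u = min(remaining,
                        min(cap[e] for e in paths[p1]),
                        min(cap[e] for e in paths[p2]))
                x[p1] = x.get(p1, 0.0) + u
                y[(p1, p2)] = y.get((p1, p2), 0.0) + u
                remaining -= u
                # multiplicative length update on the path used in each state
                for q, h_q in h.items():
                    used = paths[p2] if p1 in affected[q] else paths[p1]
                    for e in used:
                        h_q[e] *= math.exp(eps * u / cap[e])
    return x, y, iterations


cap = {"a": 10.0, "b": 10.0, "c": 10.0}
paths = {"A": ["a"], "B": ["b"], "C": ["c"]}
affected = {"q0": set(), "qa": {"A"}, "qb": {"B"}, "qc": {"C"}}
x, y, iters = concurrent_recovery(cap, [(5.0, paths)], affected, 0.1, 1e-3)
```

The returned flows are unscaled. They would still need to be divided by log_(e^ε)(1/δ), taken at the end of the last complete iteration, to obtain a capacity-feasible solution as described in the text.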

A lemma and a theorem are now discussed.

Lemma 1.1: $\lambda > \frac{t - 1}{\log_{e^{\varepsilon}}\left( \frac{1}{\delta} \right)}.$

Proof: Consider link e. Consider the third primal inequality for some fixed (q,e). Whenever x(P) is increased for some (k,P) pair such that P∈Λ_(k)\Λ(q) by some value u, the first branch of the “for” loop inside the inner while loop multiplies h^(q)(e) by e^(εu/u(e)). Whenever y^(P′)(P) is increased for some (k,P,P′), such that P∈Λ_(k)\Λ(q) and P′∈Λ(q)∩Λ_(k), by some value u, the second branch of the “for” loop inside the inner while loop multiplies h^(q)(e) by e^(εu/u(e)). Thus, h^(q)(e) increases by a factor of e^(ε) with every u(e) units that use link e in state q. Since h^(q)(e)=δ/u(e) initially and is less than 1/u(e) at the end of iteration t−1, the flow through e is at most u(e)·log_(e^ε)(1/δ) at the end of iteration t−1. Thus, dividing the primal solution at the end of iteration t−1 by log_(e^ε)(1/δ) gives a feasible solution. This ends the proof of Lemma 1.1.
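The scaling factor in the proof can be written out explicitly. As a sketch, let F^(q)(e) (a symbol introduced here only for illustration) denote the total flow routed over link e in state q by the end of iteration t−1. Each unit of flow multiplies h^(q)(e) by e^(ε/u(e)), so that

$h^{q}(e) = \frac{\delta}{u(e)}\, e^{\varepsilon F^{q}(e)/u(e)} < \frac{1}{u(e)} \quad\Longrightarrow\quad F^{q}(e) < \frac{u(e)}{\varepsilon} \ln\frac{1}{\delta} = u(e) \log_{e^{\varepsilon}}\frac{1}{\delta}.$

Dividing all flows by log_(e^ε)(1/δ)=ln(1/δ)/ε therefore brings the load on every link, in every state, below u(e).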

It should be noted that other exponential functions, instead of e^(εu/u(e)), may be used to modify h^(q)(e). However, it is likely that the function used to modify h^(q)(e) must be exponential in order for the method to provably converge to a solution.

Theorem 1.2: Suppose the optimal primal (and hence dual) solution value is at least 1. If this is not true, it is easily achievable by scaling the numerical values in the primal and dual problems, as pointed out in Garg et al., “Faster and Simpler Algorithms for Multicommodity Flow and Other Fractional Packing Problems,” IEEE Symp. on Foundation of Comp. Sci., 300-309 (1998), the disclosure of which is incorporated herein by reference. The primal solution at the end of the (t−1)st iteration divided by log_(e^ε)(1/δ) is an ε′-approximate solution, for ε′=cε, for an appropriate constant c>0, and appropriate choice of δ.

Proof: Let h_(i) be the length functions at the beginning of the i^(th) iteration. Let h_(i,k) be the length functions before routing the k^(th) commodity in the i^(th) iteration. Let h_(i,k,s) be the length functions before routing the s^(th) pair of paths for commodity k in the i^(th) iteration.

Let Z_(k)(h) be the value of Z_(k) computed using the given values of h variables before the rescaling necessary to make the dual feasible. Equivalently, Z_(k)(h) is the value of Eq. (2), where P₁,P₂ are the shortest disjoint-path pair which the method would select when presented with the length functions given. Let α(h) equal Σ_(k∈K)d_(k)Z_(k)(h). The variable α is defined so that h divided by α(h) is a feasible dual solution with value D(h)/α(h).

After flow u has been routed on P₁ and P₂ as the s^(th) pair of primary and backup paths for commodity k, the bound shown below holds for D(h_(i,k,s+1)). The bound is presented below using the following notational conveniences:

S≡{(e,q): e∈P₁, P₁∉Λ(q)}  (3)

S′≡{(e,q): e∈P₂, P₁∈Λ(q)}  (4)

Using this notation, the bound is then the following:

$\begin{matrix} \begin{matrix} {D(h_{i,k,s+1}) = \sum\limits_{e \in E} u(e) \sum\limits_{q \in Q} h_{i,k,s+1}^{q}(e)} \\ {= D(h_{i,k,s}) + \sum\limits_{(e,q) \in S} u(e)\, h_{i,k,s}^{q}(e) \left( e^{\varepsilon \frac{u}{u(e)}} - 1 \right) + \sum\limits_{(e,q) \in S'} u(e)\, h_{i,k,s}^{q}(e) \left( e^{\varepsilon \frac{u}{u(e)}} - 1 \right)} \\ {\leq D(h_{i,k,s}) + \sum\limits_{(e,q) \in S} h_{i,k,s}^{q}(e) \left( u\varepsilon + \varepsilon^{2} \frac{u^{2}}{u(e)} \right) + \sum\limits_{(e,q) \in S'} h_{i,k,s}^{q}(e) \left( u\varepsilon + \varepsilon^{2} \frac{u^{2}}{u(e)} \right),} \end{matrix} & (5) \end{matrix}$

where inequality (5) uses the fact that e^(a)≦1+a+a² for 0≦a≦1. For notational convenience, let

$\gamma \equiv \sum\limits_{e \in P_{1}} \sum\limits_{q: P_{1} \notin \Lambda(q)} h_{i,k,s}^{q}(e) + \sum\limits_{e \in P_{2}} \sum\limits_{q: P_{1} \in \Lambda(q)} h_{i,k,s}^{q}(e) = \sum\limits_{q: P_{1} \notin \Lambda(q)} h_{i,k,s}^{q}(P_{1}) + \sum\limits_{q: P_{1} \in \Lambda(q)} h_{i,k,s}^{q}(P_{2}).$

Thus, from Eq. (5), the following can be discerned:

D(h_(i,k,s+1))≦D(h_(i,k,s))+uε(1+ε)·γ  (6)

D(h_(i,k,s+1))≦D(h_(i,k,s))+uε(1+ε)Z_(k)(h_(i,k,s)),  (7)

where inequality (6) uses the fact that u is upper-bounded by min{u(P₁),u(P₂)}, and inequality (7) follows from the definition of Z_(k) in Eq. (2).
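The per-link step behind inequalities (5) and (6) can be checked numerically. The sketch below compares the exact growth of u(e)·h^(q)(e) after one update with the bound obtained from e^(a)≦1+a+a²; the function names and the sampled values are assumptions for illustration.

```python
import math


def step_increase(u, cap_e, h_qe, eps):
    """Exact growth of u(e) * h^q(e) after one multiplicative update."""
    return cap_e * h_qe * (math.exp(eps * u / cap_e) - 1.0)


def step_bound(u, cap_e, h_qe, eps):
    """Bound from e^a <= 1 + a + a^2, valid when 0 <= eps * u / u(e) <= 1."""
    return h_qe * (u * eps + eps * eps * u * u / cap_e)


# Sample a few pushes with u <= u(e) and eps <= 1 (hypothetical values).
checks = []
for u in (0.5, 1.0, 5.0, 10.0):
    for eps in (0.01, 0.1, 0.5, 1.0):
        exact = step_increase(u, 10.0, 0.3, eps)
        bound = step_bound(u, 10.0, 0.3, eps)
        # Since u <= u(e), the bound is itself at most h * u * eps * (1 + eps).
        checks.append(exact <= bound + 1e-12
                      and bound <= 0.3 * u * eps * (1 + eps) + 1e-12)
all_hold = all(checks)
```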

Note that the length functions are non-decreasing. Then over all the path pairs chosen in the process of pushing d_(k) flow for commodity k, one can conclude the following: D(h _(i,k+1))≦D(h _(i,k))+ε(1+ε)d _(k) Z _(k)(h _(i,k+1))

Each iteration pushes d_(k) for every commodity. Extending this inequality over all values of k and again noticing that the length functions (and thus Z_(k)) are nondecreasing, it is possible to conclude the following:

${D\left( {i + 1} \right)} \leq {{D(i)} + {{ɛ\left( {1 + ɛ} \right)}{\sum\limits_{k \in K}{d_{k}{Z_{k}\left( h_{i + 1} \right)}}}}} \leq {{D(i)} + {{ɛ\left( {1 + ɛ} \right)}{{\alpha\left( h_{i + 1} \right)}.}}}$ Thus, a crucial inequality has been established here that may be used as in the analysis presented in Garg (incorporated by reference above), Section 5. Let β be the optimal dual (and thus primal) objective function value. Since the initial objective function value is at most m(|Q|+1)δ, through a very similar analysis to that in Garg, one can conclude that if β≧1 and if the method terminates in iteration t then

$\frac{\beta}{t - 1} \leq \frac{\varepsilon(1 + \varepsilon)}{(1 - \varepsilon) \ln\frac{1 - \varepsilon}{m|Q|\delta}}.$

Thus, setting δ=((m|Q|+1)/(1−ε))^(−1/ε) implies (using the same arguments as in Garg) that this gives a provably good approximate solution. This ends the proof of Theorem 1.2.

Run time analysis: Let n denote the number of nodes in G (as mentioned above, m denotes the number of links in G). Also recall that |Q|=m+1. Each augmentation requires O(m|Q||Λ|²(log n)^(O(1))) time to find a shortest path. Similar to the analysis in Garg, Section 7, the number of iterations before the stopping criterion is met is O(ε^(−2)(|Λ|+m|Q|)(log n)^(O(1))). However, in applications of the present invention, typically d_(k)≦u(P), so that the shortest path calculation at each step is also a minimum cost flow computation. Since minimum cost flow computations require fewer iterations, this gives an improved bound on the run time of O(ε^(−2)m|Q||Λ|³(log n)^(O(1))). For support that minimum cost flow computations require fewer iterations, see, for example, Garg, Section 5, or Grigoriadis et al., “Fast Approximation Schemes for Convex Programs With Many Blocks and Coupling Constraints,” Society of Industrial and Applied Mathematics (SIAM) J. on Optimization, 4:86-107 (1996), the disclosure of which is hereby incorporated by reference.

2. Concurrent Recovery With Budget Constraint

One motivation for studying the problem in the preceding section is to develop a framework for quickly obtaining feasible solutions to (MOC). In fact, network designers are often interested in integer solutions where every commodity is assigned one regular path and one backup path. This latter problem is NP-hard although one can obtain near-optimal solutions by rounding a solution to the linear program relaxation, e.g., one of the ε-approximate solutions.

This section addresses the full version of the problem (MOC) where the method of the previous section is modified to reserve bandwidth on links (not exceeding their capacity) which supports a routing and recovery solution as given in the previous section. The techniques of the previous section are only marginally suitable to obtain ε-approximate solutions to (MOC). Instead, in this section, an alternate approach is considered. Instead of searching for the minimum cost bandwidth, the view is taken in this section that bandwidth is paid for as the method progresses. In this setting, backup capacity is paid for only as often as failures dictate that this capacity be purchased. In this sense, a minimum cost multicommodity flow and recovery problem may be formulated as follows.

Assume that, for each commodity k and paths P₁,P₂∈Λ_(k), there is given a cost c(P₁,P₂) whose interpretation is as follows. If P₁=P₂, then this is the per-unit cost of sending flow on the path. Otherwise, this is the per-unit cost of sending backup flow for P₁ on path P₂. Note that these constants may be hard-wired in advance to account for known statistical information about network failures. For instance, if it is possible to determine a value κ(P) for each path P which indicates the percentage of time P is in a non-failed state, then setting c(P₁,P₂)=(1−κ(P₁))Σ_(e∈P₂)c(e) represents the total cost of backup flow on P₂ for P₁. Similarly, c(P,P)=κ(P)Σ_(e∈P)c(e) represents the total cost of primary flow on path P. Henceforth, set c(P):=c(P,P) and this notation should not be confused with its prior meaning. This new objective function is then:

$\begin{matrix} {\min \sum\limits_{k} \sum\limits_{P \in \Lambda_{k}} \left( c(P)\, x(P) + \sum\limits_{P_{1} \in \Lambda_{k} \backslash P} c(P_{1},P)\, y^{P_{1}}(P) \right).} & (8) \end{matrix}$

Lemma 2.1: Let (x*,y*,ƒ*) be a solution that minimizes Σ_(e)ƒ(e)c(e), and let (x̃,ỹ,ƒ̃) be a solution that minimizes Eq. (8). If there is a constant ξ≧1 such that c(P)≦ξc(P′) for all pairs P,P′∈Λ_(k) and all choices of k∈K, then Σ_(e)ƒ̃(e)c(e)≦(1+ξ)Σ_(e)ƒ*(e)c(e).

Proof: The following can be shown.

$\sum\limits_{e} \tilde{f}(e)\, c(e) \leq \sum\limits_{P} c(P) \left[ \tilde{x}(P) + \sum\limits_{P'} \tilde{y}^{P'}(P) \right] \leq \sum\limits_{P} c(P) \left[ x^{*}(P) + \sum\limits_{P'} y^{*P'}(P) \right] \leq (1 + \xi) \sum\limits_{P} c(P)\, x^{*}(P) \leq (1 + \xi) \sum\limits_{e} f^{*}(e)\, c(e).$

Thus, optimizing one choice of objective function provides a solution for which the value of the second objective function is close to optimal. In practice, the solution obtained using the proxy objective function should be much closer than the worst case bound given by this analysis. In particular, the first, third, and fourth inequalities are likely to be far from tight. For the third inequality, this is because ξ may be large when c(P) is very small for some P, but in this case, the total contribution to Eq. (8) of the capacity reserved on P is also small. When c(P) is large for some commodity, then the other paths for that commodity should also be close in cost. In addition, since in the end the fractional solutions are rounded to obtain integer solutions, the gain in cost due to using the proxy objective function is small relative to the gain from rounding.

The minimum cost version need not be solved directly, but the maximum concurrent flow LP may be solved with an additional budget constraint. With an ε-approximate method for this problem, one can obtain an approximate minimum cost solution by using an efficient search method (such as binary search) to find the minimum cost budget.
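The budget search can be sketched as an ordinary bisection. In the example below, `lam_of_budget` stands in for a call to an ε-approximate budgeted solver that returns λ for a given budget B; that oracle, the search interval, and the tolerance are all hypothetical and are assumed only for illustration. The search relies on λ being nondecreasing in B.

```python
def min_feasible_budget(lam_of_budget, lo, hi, tol=1e-6):
    """Smallest budget B in [lo, hi], to within tol, with lam_of_budget(B) >= 1.

    lam_of_budget is a hypothetical oracle, assumed nondecreasing in B.
    Returns None when even the largest budget cannot route all demand.
    """
    if lam_of_budget(hi) < 1.0:
        return None
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if lam_of_budget(mid) >= 1.0:
            hi = mid   # budget mid suffices, try a smaller one
        else:
            lo = mid   # budget mid is too small
    return hi


# Toy oracle: all demand is routable once the budget reaches 40.
toy_oracle = lambda budget: min(budget / 40.0, 2.0)
best_budget = min_feasible_budget(toy_oracle, 0.0, 100.0)
```

Each oracle call corresponds to one run of the budgeted method, so the number of runs grows only logarithmically with the ratio of the search interval to the tolerance.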

The budgeted linear program (referred to as BCR, for Budgeted Concurrent Recovery) and the corresponding dual linear program (DBCR) are now presented.

-   -   max λ         subject to the following:

$\sum\limits_{P \in \Lambda_{k}} x(P) \geq \lambda\, d_{k} \quad \forall k \in K$

$\sum\limits_{P \in \Lambda_{k} \backslash P'} \left[ x(P) + y^{P'}(P) \right] \geq \lambda\, d_{k} \quad \forall P' \in \Lambda_{k},\ \forall k \in K$

$\sum\limits_{k \in K} \sum\limits_{P \in \Lambda_{k} \backslash \Lambda(q):\, e \in P} \left[ x(P) + \sum\limits_{P' \in \Lambda_{k} \cap \Lambda(q)} y^{P'}(P) \right] \leq u(e) \quad \forall e \in E,\ \forall q \in Q$

$\sum\limits_{k \in K} \sum\limits_{P \in \Lambda_{k}} \left[ c(P)\, x(P) + \sum\limits_{P' \in \Lambda_{k} \backslash P} c(P',P)\, y^{P'}(P) \right] \leq B$

$x, y, \lambda \geq 0.$

The dual (DBCR) of this linear program is the following:

$\min\ \varphi B + \sum\limits_{e \in E} u(e) \sum\limits_{q \in Q} h^{q}(e)$

$c(P',P)\varphi + \sum\limits_{q: P' \in \Lambda(q)} \sum\limits_{e \in P} h^{q}(e) \geq w^{P'} \quad \forall k \in K,\ \forall P', P \in \Lambda_{k},\ P \neq P'$

$c(P)\varphi + \sum\limits_{q: P \notin \Lambda(q)} \sum\limits_{e \in P} h^{q}(e) \geq z_{k} + \sum\limits_{P' \in \Lambda_{k} - \{P\}} w^{P'} \quad \forall P \in \Lambda_{k},\ \forall k \in K$

$\sum\limits_{k \in K} d_{k} z_{k} + \sum\limits_{k \in K} \sum\limits_{P' \in \Lambda_{k}} d_{k} w^{P'} \geq 1$

$\varphi, h, w, z \geq 0$

For ease of notation in the following, if ƒ is a linear function defined on ground set E and P⊂E, then ƒ(P) will be used to denote Σ_(e∈P)ƒ(e).

2.1 An Exemplary Method for Budgeted Concurrent Recovery

The problem (BCR) is a mixed packing and covering linear program: it has form max λ, Px≦p, Cx≧λc, x≧0 for nonnegative matrices P and C and positive vectors p and c. Start with x,y≡0, h^(q)(e)=δ/u(e), and φ=δ/B. Given a singleton-vector pair (φ,h), let μ_(P)(φ,h) and v_(P′,P)(φ,h) be defined as follows:

$\mu_{P}(\varphi,h) := c(P)\varphi + \sum\limits_{q: P \notin \Lambda(q)} h^{q}(P), \qquad v_{P',P}(\varphi,h) := c(P',P)\varphi + \sum\limits_{q: P' \in \Lambda(q)} h^{q}(P).$
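The two quantities can be computed directly from their definitions. The following sketch does so for one commodity; the function names `mu` and `nu`, the argument layout, and the sample values are assumptions for this example.

```python
def h_len(h_q, links):
    """h^q(P): sum of the state-q lengths over the links of a path."""
    return sum(h_q[e] for e in links)


def mu(p, paths, c, phi, h, affected):
    """mu_P: budget-weighted primary cost plus lengths in states where P survives."""
    return c[(p, p)] * phi + sum(h_len(h_q, paths[p])
                                 for q, h_q in h.items() if p not in affected[q])


def nu(p_failed, p, paths, c, phi, h, affected):
    """v_{P',P}: budget-weighted backup cost plus lengths of P in states where P' fails."""
    return c[(p_failed, p)] * phi + sum(h_len(h_q, paths[p])
                                        for q, h_q in h.items() if p_failed in affected[q])


# Hypothetical two-path commodity with three network states.
paths = {"A": ["a"], "B": ["b"]}
affected = {"q0": set(), "qa": {"A"}, "qb": {"B"}}
h = {"q0": {"a": 1.0, "b": 2.0},
     "qa": {"a": 3.0, "b": 4.0},
     "qb": {"a": 5.0, "b": 6.0}}
c = {("A", "A"): 10.0, ("B", "B"): 20.0, ("A", "B"): 1.0, ("B", "A"): 2.0}
phi = 0.5
mu_a = mu("A", paths, c, phi, h, affected)
nu_ab = nu("A", "B", paths, c, phi, h, affected)
```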

When φ and h are clear from context, they are written as μ_(P) and v_(P′,P). The variables z_(k) and w^(P′) are obtained for P′∈Λ_(k) by finding an optimal solution to the following new linear program, where the constraint (10) is supposed to hold for all P, P′∈Λ_(k), P≠P′:

$\begin{matrix} {\max\ d_{k} \left[ z_{k} + \sum\limits_{P' \in \Lambda_{k}} w^{P'} \right]} & (9) \\ {w^{P'} \leq v_{P',P}} & (10) \\ {z_{k} + \sum\limits_{P' \in \Lambda_{k} - \{P\}} w^{P'} \leq \mu_{P} \quad \forall P \in \Lambda_{k}} & (11) \\ {z, w \geq 0} & (12) \end{matrix}$

Let Z_(k) be the optimal value of this problem; that is, Z_(k)=z_(k)+Σ_(P∈Λ_(k))w^(P). By dividing the solution (φ,h,w,z) by Σ_(k)d_(k)Z_(k), a feasible solution is obtained to (DBCR). Thus for now, the main concern is with only dual variables h and φ.

As with the previous method, the method for the budgeted concurrent recovery problem cycles through the commodities. For commodity k, using current vectors h_(l,k−1) and φ_(l,k−1), an optimal solution to the following problem is found, and this solution is used to determine the update step in the current iteration (the second of the following three constraints is supposed to hold for all P′∈Λ_(k)):

${\min{\sum\limits_{P \in \Lambda_{k}}{\mu_{P}{x(P)}}}} + {\sum\limits_{{({P,P^{\prime}})} \in \Lambda_{k}^{2}}{v_{P^{\prime},P}{y^{P^{\prime}}(P)}}}$ subject to the following:

$\begin{matrix} \begin{matrix} {{\sum\limits_{P \in \Lambda_{k}}{x(P)}} \geq d_{k}} \\ {{{{\sum\limits_{{P \in \Lambda_{k}};{P \neq P^{\prime}}}{x(P)}} + {\sum\limits_{{P \in \Lambda_{k}},{P \neq P^{\prime}}}{y^{P^{\prime}}(P)}}} \geq d_{k}}{x,{y \geq 0}}} \end{matrix} & (13) \end{matrix}$

Note that the linear program described by Eqs. (9)-(12) is the dual of Eq. (13). The optimal solutions to both problems take one of two forms. The structure of the optimal solution to both Eq. (13) and Eqs. (9)-(12) is described in Lemma 2.2. To begin, two functions are introduced that will be useful to demonstrate this. The first is an extension of the Z_(k) introduced in the previous section.

$\begin{matrix} {{\overset{\_}{Z}}_{k} = {{\min\limits_{P_{1},{P_{2} \in {\Lambda_{k}:{P_{1} \neq P_{2}}}}}\mu_{P_{1}}} + v_{P_{1},P_{2}}}} & (14) \end{matrix}$

To obtain the second, assume the P_(i)∈Λ_(k) are indexed by increasing μ_(P_(i)) value, so that μ_(P₁)≦μ_(P₂)≦ . . . .

$\begin{matrix} {\hat{Z}_{k} = \min\limits_{2 \leq r \leq |\Lambda_{k}|} \left( \sum\limits_{i = 1}^{r} \mu_{P_{i}} / (r - 1) \right)} & (15) \end{matrix}$

Let r_(k) be the value of r that determines Ẑ_(k) in this expression.
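Both functions are inexpensive to evaluate once the μ and v values are known. The sketch below computes Eq. (14) by enumeration and Eq. (15) with a running sum over the sorted μ values; the names `z_bar` and `z_hat` and the dictionary inputs are assumptions for illustration.

```python
def z_bar(mu, nu):
    """Eq. (14): cheapest primary/backup pair.

    mu: dict path -> mu_P
    nu: dict (failed path P1, backup path P2) -> v_{P1,P2}
    """
    return min(mu[p1] + nu[(p1, p2)] for p1 in mu for p2 in mu if p1 != p2)


def z_hat(mu):
    """Eq. (15): returns (value, r_k), using the r smallest mu values."""
    vals = sorted(mu.values())
    best, best_r = None, None
    running = vals[0]
    for r in range(2, len(vals) + 1):
        running += vals[r - 1]          # sum of the r smallest mu values
        cand = running / (r - 1)
        if best is None or cand < best:
            best, best_r = cand, r
    return best, best_r


# Hypothetical values.
spread = z_hat({"A": 1.0, "B": 1.0, "C": 1.0})     # spreading over 3 paths wins
pair = z_hat({"A": 1.0, "B": 2.0, "C": 10.0})      # only the 2 cheapest are used
zb = z_bar({"A": 1.0, "B": 2.0}, {("A", "B"): 5.0, ("B", "A"): 1.0})
```

When several values of r tie, this sketch keeps the smallest one, which is one possible tie-breaking choice.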

Lemma 2.2: The optimal solution to Eq. (13) has value equal to d_(k) min{Z̄_(k), Ẑ_(k)}. If the optimal solution has value d_(k)Z̄_(k) determined by P₁ and P₂, then it satisfies x(P₁)=y^(P₁)(P₂)=d_(k), with all other variables zero. Otherwise, it has value d_(k)Ẑ_(k) and there are r distinct paths in Λ_(k) such that

${x\left( P_{i} \right)} = \frac{d_{k}}{r - 1}$ for i=1, . . . , r, all other variables are zero.

Proof: Ẑ_(k)≦Z̄_(k): In this case, it is claimed that

${x\left( P_{i} \right)} = \frac{d_{k}}{r_{k} - 1}$ for i≦r_(k), for r_(k) the value that determines Ẑ_(k), and x(P_(i))=0 for i>r_(k), is an optimal solution. This is clearly feasible and has value d_(k)Ẑ_(k). Now consider the dual solution w(P_(i))=Ẑ_(k)−μ_(i) if i≦r_(k) (and w(P_(i))=0 otherwise) and z_(k)=0. This clearly satisfies Eq. (11). By Lemma 2.3, w(P_(i))≧0 for all i, satisfying Eq. (12). Since Ẑ_(k)≦Z̄_(k)≦μ_(P_(i))+v_(P_(i),P_(j)), Eq. (10) is satisfied for all P,P′. Thus, the solution is feasible. Since its value is d_(k)Ẑ_(k), matching the value of the above solution to Eq. (13), both primal and dual solutions are optimal.

Case $\bar{Z}_k \leq \hat{Z}_k$: If $\bar{Z}_k = \mu_{P_1} + v_{P_1,P_2}$, set $x(P_1) = y^{P_1}(P_2) = d_k$ and all other primal variables equal to 0. This solution is clearly feasible and has value $d_k \bar{Z}_k$. Consider the following dual solution:

$\begin{matrix} {w(P_i) = \max\{\bar{Z}_k - \mu_{P_i}, 0\} \quad \forall i \leq |\Lambda_k|} & (16) \end{matrix}$

and $z_k = \bar{Z}_k - \sum_i w(P_i)$. Since $\bar{Z}_k \leq \hat{Z}_k$, Lemma 2.3 implies that $\bar{Z}_k \leq \mu_P$ for all P with $w(P) = 0$, and hence this solution satisfies Eq. (11). By definition, w is nonnegative. Summing Eq. (16) over all i with $w(P_i) > 0$, where r denotes the number of such paths, yields $\sum w(P_i) = r\bar{Z}_k - \sum \mu_{P_i}$, which implies that $z_k = \bar{Z}_k - \sum w(P_i) = \sum \mu_{P_i} - (r - 1)\bar{Z}_k \geq \sum \mu_{P_i} - (r - 1)\hat{Z}_k \geq 0$, where the second inequality follows from the definition of $\hat{Z}_k$. Finally, $\bar{Z}_k \leq \mu_{P_i} + v_{P_i,P_j}$ for all i and j. Hence Eq. (10) is satisfied, and the solution is feasible. It has value $d_k \bar{Z}_k$, matching the given solution to its dual Eq. (13); hence both are optimal.

Lemma 2.3: {circumflex over (Z)}_(k)≧μ_(P) _(i) for i≦r_(k) and {circumflex over (Z)}_(k)≦μ_(P) _(i) for i>r_(k).

Proof:

Let $\hat{Z}_k(r) := \frac{1}{r - 1}\sum\limits_{i = 1}^{r} \mu_{P_i}$. For ease of notation, $\mu_{P_i}$ is written as $\mu_i$ for the remainder of the proof. The lemma follows by noting that $\hat{Z}_k(r+1)$ is a convex combination of $\hat{Z}_k(r)$ and $\mu_{r+1}$ (specifically, $\hat{Z}_k(r+1) = \frac{r-1}{r}\hat{Z}_k(r) + \frac{1}{r}\mu_{r+1}$). Thus, $\mu_i$ for i>2 is included in the definition of $\hat{Z}_k$ if and only if $\mu_i \leq \hat{Z}_k(i-1)$. Since $\mu_i \leq \mu_j$ for i<j, if $\mu_i \leq \hat{Z}_k(i-1)$, then $\mu_i \leq \hat{Z}_k(j)$ for all j>i.
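As a sanity check on Lemma 2.3, the following Python sketch evaluates $\hat{Z}_k(r)$ for every r, verifies the convex-combination identity used in the proof, and tests the threshold property of the lemma on a given list of path costs. The function names and the floating-point tolerances are assumptions of the sketch, and nonnegative costs are assumed.

```python
def z_hat_r(mu_sorted, r):
    """Z_hat_k(r): sum of the r cheapest costs divided by (r - 1)."""
    return sum(mu_sorted[:r]) / (r - 1)


def check_lemma_2_3(mu):
    """Return (Z_hat, r_k, holds) for nonnegative path costs mu."""
    mu_sorted = sorted(mu)
    n = len(mu_sorted)
    # Z_hat(r+1) = ((r-1)/r) * Z_hat(r) + (1/r) * mu_{r+1}
    for r in range(2, n):
        lhs = z_hat_r(mu_sorted, r + 1)
        rhs = ((r - 1) / r) * z_hat_r(mu_sorted, r) + mu_sorted[r] / r
        assert abs(lhs - rhs) < 1e-9
    values = {r: z_hat_r(mu_sorted, r) for r in range(2, n + 1)}
    r_k = min(values, key=values.get)  # smallest minimizing r
    z = values[r_k]
    # Lemma 2.3: costs of the first r_k paths are <= Z_hat, the rest >= Z_hat
    below = all(mu_sorted[i] <= z + 1e-12 for i in range(r_k))
    above = all(mu_sorted[i] >= z - 1e-12 for i in range(r_k, n))
    return z, r_k, below and above
```

On the costs 3, 1, 2 and 10 the sketch reports $\hat{Z}_k = 3$ with r_k = 2, and the lemma's two inequalities hold (with equality for the third-cheapest path).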

Lemma 2.2 implies that the optimal solution to Eq. (13) may be described as a set of paths, each carrying an equal amount of flow. Without loss of generality, assume this set is $\{P_1, \ldots, P_{r_k}\}$, where $r_k \geq 2$. (If $\bar{Z}_k < \hat{Z}_k$, this set is $\{P_1, P_2\}$ and $r_k = 2$.) The amount of flow sent along each path is u where

$u := \min\left\{ \frac{d_k}{r_k - 1},\ \min_{i \leq r_k} u(P_i),\ \frac{B}{\sum\limits_{i = 1}^{r_k} c(P_i)} \right\}.$ This quantity u is set so that the left side of an inequality in (BCR) does not increase by more than the fixed value of the right side in any one step. In practice, $\sum_{P \in \Lambda_k} c(P)$ is significantly smaller than B, so that, when combined with scaling of capacities, u is determined by $d_k$. The updates to the dual variables for each case are described in the method given below. The stopping criterion is the same as in Garg, which is incorporated by reference above.
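The choice of u can be sketched in Python as follows. The function name and argument layout are hypothetical; each entry of `path_capacities` is assumed to be the bottleneck capacity $u(P_i)$ of a selected path, and `remaining_demand` stands for the demand still to be routed for the commodity.

```python
def step_size(remaining_demand, path_capacities, path_costs, budget):
    """Flow to send on each of the r_k selected paths in one step.

    Takes the minimum of three terms: the demand share d / (r_k - 1),
    the smallest path capacity, and the budget term B / sum of path costs.
    """
    r_k = len(path_costs)
    if r_k < 2 or len(path_capacities) != r_k:
        raise ValueError("need matching lists for at least two paths")
    return min(
        remaining_demand / (r_k - 1),
        min(path_capacities),
        budget / sum(path_costs),
    )
```

With three paths of capacities 8, 9 and 7, costs 2, 2 and 1, demand 10 and budget 20, the three terms are 5, 7 and 4, so the budget term determines the step.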

Budget-LP (G = (Λ, E, c, u, d))
  Initialize h^q(e) = δ/u(e) ∀e ∈ E, ∀q ∈ Q
  Initialize φ = δ/B
  Initialize x ≡ 0, y ≡ 0
  while D(h) < 1
    for k = 1 to |K| do
      d′ ← d_k
      while D(h) < 1 and d′ > 0
        $Z_k = \min\{\bar{Z}_k, \hat{Z}_k\}$. Let $\{P_1, P_2, \ldots, P_{r_k}\}$ be paths achieving the optimum $Z_k$.
        $C \leftarrow \frac{B}{\sum_{i=1}^{r_k} c(P_i)}$
        $u_0 \leftarrow \min\left\{ \frac{d'}{r_k - 1},\ \min_{1 \leq i \leq r_k,\ e \in P_i} u(e),\ C \right\}$
        $\varphi \leftarrow \varphi\, e^{\varepsilon u_0 / C}$
        d′ ← d′ − (r_k − 1)u_0
        if $Z_k = \bar{Z}_k$,
          $x(P_1) \leftarrow x(P_1) + u_0$
          $y^{P_1}(P_2) \leftarrow y^{P_1}(P_2) + u_0$
          for q ∈ Q
            if P_1 ∉ Λ(q) do ∀e ∈ P_1, $h^q(e) \leftarrow h^q(e)\, e^{\varepsilon u_0 / u(e)}$
            else do ∀e ∈ P_2, $h^q(e) \leftarrow h^q(e)\, e^{\varepsilon u_0 / u(e)}$
          end for
        else ($Z_k = \hat{Z}_k$),
          for i ≦ r_k,
            $x(P_i) \leftarrow x(P_i) + u_0$
            for e ∈ P_i, q such that P_i ∉ Λ(q),
              $h^q(e) \leftarrow h^q(e)\, e^{\varepsilon u_0 / u(e)}$
            end for
          end for
      end while
    end for
  end while

2.2 Analysis

The extended arguments for

$\lambda > \frac{t - 1}{\log_{e^{\varepsilon}}\left( {1/\delta} \right)}$ are similar to the previous arguments. It is now shown that the primal solution divided by this choice of λ after t iterations is indeed an ε-approximate solution.

Theorem 2.4: Suppose the optimal primal (and hence dual) solution value is at least 1. The primal solution at the end of the (t−1)st iteration divided by $\log_{e^{\varepsilon}}(1/\delta)$ is an ε′-approximate solution, for ε′=cε, for an appropriate constant c>0, and appropriate choice of δ.
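The scaling factor in Theorem 2.4 is a logarithm to the base $e^{\varepsilon}$, which equals $\ln(1/\delta)/\varepsilon$. A minimal Python sketch of the scaling step follows; the function names and the dictionary representation of the primal flows are assumptions made for illustration, and the values of δ and ε are left to the caller.

```python
import math


def log_base_e_eps(delta, eps):
    """log_{e^eps}(1/delta), computed as ln(1/delta) / eps."""
    if not (0.0 < delta < 1.0) or eps <= 0.0:
        raise ValueError("expects 0 < delta < 1 and eps > 0")
    return math.log(1.0 / delta) / eps


def scale_primal(x, delta, eps):
    """Divide every path flow in x (path name -> flow) by the factor."""
    factor = log_base_e_eps(delta, eps)
    return {path: flow / factor for path, flow in x.items()}
```

For instance, with $\delta = e^{-2}$ and ε = 0.5 the factor is 4, so a path carrying 8 units in the unscaled solution carries 2 units after scaling.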

Proof: Let $h_i$ and $\varphi_i$ be the dual variables at the beginning of the i^(th) iteration. Let $h_{i,k}$ and $\varphi_{i,k}$ be the dual variables before routing the k^(th) commodity in the i^(th) iteration. Let $h_{i,k,s}$ and $\varphi_{i,k,s}$ be the dual variables before routing the s^(th) pair of paths for commodity k in the i^(th) iteration. Let D(i,k,s) be the value of the dual objective function using dual variables $h_{i,k,s}$ and $\varphi_{i,k,s}$.

Let $Z_k(h,\varphi)$ be the value of $Z_k$ computed using the given values of h and φ before the rescaling necessary to make the dual feasible. Equivalently, $Z_k(h,\varphi) = \min\{\bar{Z}_k(h,\varphi), \hat{Z}_k(h,\varphi)\}$. Let $\alpha(h,\varphi)$ equal $\sum_{k \in K} d_k Z_k(h,\varphi)$. The quantity α has been designed so that (h,φ) divided by α(h,φ) is a feasible dual solution with value D(h,φ)/α(h,φ). Below, it is established that

$\begin{matrix} {D(i,k,s+1) \leq D(i,k,s) + u\varepsilon(1+\varepsilon) Z_k(h_{i,k,s}, \varphi_{i,k,s})} & (17) \end{matrix}$

holds no matter how $Z_k$ is determined. This is done by analyzing each case separately. Once Eq. (17) is established, the remaining argument mirrors the argument in the proof of Theorem 1.2, so this portion of the proof is omitted here.

Suppose $Z_k(h_{i,k,s},\varphi_{i,k,s}) = \bar{Z}_k(h_{i,k,s},\varphi_{i,k,s})$. Let S and S′ be as in Eqs. (3) and (4). After u is routed on $P_1$ and $P_2$ as the s^(th) pair of primary and backup paths for commodity k, the following may be determined:

$\begin{aligned} D(i,k,s+1) &= B\varphi_{i,k,s+1} + \sum_{e \in E} u(e) \sum_{q \in Q} h_{i,k,s+1}^{q}(e) \\ &= D(i,k,s) + B\varphi_{i,k,s}\left( e^{\varepsilon u (c(P_1) + c(P_2))/B} - 1 \right) \\ &\quad + \sum_{(e,q) \in S} u(e)\, h_{i,k,s}^{q}(e) \left( e^{\varepsilon u / u(e)} - 1 \right) + \sum_{(e,q) \in S'} u(e)\, h_{i,k,s}^{q}(e) \left( e^{\varepsilon u / u(e)} - 1 \right). \end{aligned}$

Thus, letting $\gamma \equiv c(P_1) + c(P_2)$, the following may be determined:

$\begin{matrix} {\begin{aligned} D(i,k,s+1) &\leq D(i,k,s) + \varepsilon u \gamma \varphi_{i,k,s} + \varepsilon^{2} u^{2} \gamma^{2} \varphi_{i,k,s} / B \\ &\quad + \sum_{(e,q) \in S} h_{i,k,s}^{q}(e) \left( u\varepsilon + \varepsilon^{2} u^{2} / u(e) \right) + \sum_{(e,q) \in S'} h_{i,k,s}^{q}(e) \left( u\varepsilon + \varepsilon^{2} u^{2} / u(e) \right) \end{aligned}} & (18) \\ {\leq D(i,k,s) + u\varepsilon(1+\varepsilon)\varphi_{i,k,s}\gamma + u\varepsilon(1+\varepsilon) \sum_{(e,q) \in (S \cup S')} h_{i,k,s}^{q}(e)} & (19) \\ {\begin{aligned} &= D(i,k,s) + u\varepsilon(1+\varepsilon)\left\lbrack \mu_{P_1} + v_{P_1,P_2} \right\rbrack \\ &= D(i,k,s) + u\varepsilon(1+\varepsilon) Z_k(h_{i,k,s}, \varphi_{i,k,s}), \end{aligned}} & (20) \end{matrix}$

where inequality (18) uses the fact that $e^{a} \leq 1 + a + a^{2}$ for 0≦a≦1; inequality (19) uses the fact that

$u \leq \min\left\{ u(P_1),\ u(P_2),\ \frac{B}{c(P_1) + c(P_2)} \right\};$ and equality (20) follows from the determination of $Z_k$ as in (14).
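The bound derived for the primary/backup case can be checked numerically. The Python sketch below computes the exact increase of the dual objective after one step and compares it with $u\varepsilon(1+\varepsilon)[\varphi\gamma + \sum h^q(e)]$. Eqs. (3) and (4) are not restated in this section, so the sketch assumes, following the update rule of the Budget-LP method, that S pairs the links of P₁ with failures q that do not affect P₁, and S′ pairs the links of P₂ with failures that do. All names and the toy data layout are hypothetical.

```python
import math


def pair_step_increase(B, phi, h, cap, cost, P1, P2, affected, eps, u):
    """Return (exact dual increase, claimed upper bound) for one step.

    h:        dict (edge, q) -> current dual value h^q(e)
    cap/cost: dict edge -> capacity u(e) / cost c(e)
    affected: dict q -> True if failure q affects the primary path P1
    """
    gamma = sum(cost[e] for e in P1) + sum(cost[e] for e in P2)
    # (edge, failure) pairs whose dual is multiplied in this step
    touched = [(e, q) for q, hit in affected.items()
               for e in (P2 if hit else P1)]
    increase = B * phi * (math.exp(eps * u * gamma / B) - 1.0)
    increase += sum(cap[e] * h[(e, q)] * (math.exp(eps * u / cap[e]) - 1.0)
                    for e, q in touched)
    z = phi * gamma + sum(h[(e, q)] for e, q in touched)
    bound = u * eps * (1.0 + eps) * z
    return increase, bound
```

The bound relies on $\varepsilon \leq 1$ and on u not exceeding the path capacities or B/γ, which is what the choice of u guarantees.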

Next, suppose Z_(k)(h_(i,k,s),φ_(i,k,s))={circumflex over (Z)}_(k)(h_(i,k,s),φ_(i,k,s)). For ease of notation, the following are defined:

$\begin{matrix} {\begin{aligned} S'' &\equiv \bigcup\limits_{j = 1}^{r} \left\{ (e,q) : e \in P_j,\ P_j \notin \Lambda(q) \right\}; \\ \psi &\equiv \varepsilon u \cdot \left( \sum\limits_{j = 1}^{r} c(P_j) \right) / (r - 1), \text{ and} \\ \eta &\equiv \varepsilon u / (r - 1). \end{aligned}} & (21) \end{matrix}$ (Note that the union in Eq. (21) is a union of r pairwise disjoint sets, since the paths $P_j$ are pairwise link-disjoint.) After u/(r−1) is routed in the i^(th) iteration in the s^(th) round for commodity k on the cheapest $r_k$ paths determined by $\hat{Z}_k$, the following may be developed:

$\begin{matrix} {\begin{aligned} D(i,k,s+1) &= B\varphi_{i,k,s+1} + \sum_{e \in E} u(e) \sum_{q \in Q} h_{i,k,s+1}^{q}(e) \\ &= D(i,k,s) + B\varphi_{i,k,s}\left( e^{\varepsilon u / C} - 1 \right) + \sum_{(e,q) \in S''} u(e)\, h_{i,k,s}^{q}(e) \left( e^{\eta / u(e)} - 1 \right) \\ &\leq D(i,k,s) + \varphi_{i,k,s}\left( \psi + \psi^{2}/B \right) + \sum_{(e,q) \in S''} h_{i,k,s}^{q}(e) \left( \eta + \eta^{2}/u(e) \right) \end{aligned}} & (22) \\ {\leq D(i,k,s) + \eta(1+\varepsilon) \sum\limits_{j = 1}^{r} \mu_{P_j}(\varphi_{i,k,s}, h_{i,k,s})} & (23) \\ {= D(i,k,s) + u\varepsilon(1+\varepsilon) Z_k(h_{i,k,s}, \varphi_{i,k,s}),} & (24) \end{matrix}$

where inequality (22) uses the fact that $e^{a} \leq 1 + a + a^{2}$ for 0≦a≦1; inequality (23) uses the fact that $u \leq \min\{(r-1)u(P_j), C\}$; and equality (24) follows from the determination of $\hat{Z}_k$ as in (15), the fact that $Z_k = \hat{Z}_k$, and the choice of r and $P_j$.

3. Commodity-Dedicated Capacity Reservation

In this section, a simplified version of (MOC) is considered that insists on dedicated reserve capacity for each commodity. In (MOC), capacity used by a backup path for a failure affecting commodity k could be used by a different commodity when a different failure occurs. This more sophisticated model allows for better utilization of network capacity. In current network models, however, this is not done. Instead, each commodity has dedicated reserve capacity. That is, if any path fails, then there is enough spare capacity reserved for that commodity to route all the demand for that commodity. Since there is no transfer of flow from one path to another in case of failure, another commonly used objective function is considered here: letting ƒ(e) be the total flow on link e, the total cost of the flow, i.e., Σ_(eεE)c(e)ƒ(e), is minimized. This problem and its variant, where the paths are chosen on the fly, have been studied by Bienstock and Muratore in the context of capacity expansion with integrality constraints. See Bienstock and Muratore, “Strong Inequalities for Capacitated Survivable Network Design Problems,” Math. Programming 89, 127-147 (2001), the disclosure of which is hereby incorporated by reference.

An ε-approximation scheme is described here to solve the corresponding linear program for this dedicated reservation problem by observing that this problem corresponds to the special case of the previous problem with all y variables forced to zero. In this case, Z_(k) is always determined by {circumflex over (Z)}_(k). Likewise, the h^(q)(e) variables are replaced by a single variable h(e), and z_(k)=0 so may also be omitted. This modification in linear program formulation is highlighted below. The method and analysis then follow from the previous section, so are omitted. The dedicated capacity concurrent recovery problem:

$\begin{matrix} {\begin{aligned} \min\ & \sum_{e \in E} c(e) f(e) \\ \text{s.t.}\ & \sum_{P \in \Lambda_k \backslash \{P'\}} x(P) \geq d_k && \forall k \in K,\ P' \in \Lambda_k \\ & \sum_{P : e \in P} x(P) \leq f(e) && \forall e \in E \\ & f(e) \leq u(e) && \forall e \in E \\ & x(P) \geq 0 \end{aligned}} & (25) \end{matrix}$

It is possible to remove the variables ƒ(e) from this formulation to obtain an equivalent formulation:

$\begin{aligned} \min\ & \sum_{P \in \Lambda} c(P) x(P) \\ \text{s.t.}\ & \sum_{P \in \Lambda_k \backslash \{P'\}} x(P) \geq d_k && \forall k \in K,\ P' \in \Lambda_k \\ & \sum_{P : e \in P} x(P) \leq u(e) && \forall e \in E \\ & x(P) \geq 0 \end{aligned}$
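The constraints of this path-based formulation can be checked directly for a candidate flow. The Python sketch below is an illustration only: the function name, the nested-dictionary layout, and the assumption that path names are unique across commodities are all hypothetical.

```python
def dedicated_feasible(x, paths, demand, capacity, tol=1e-9):
    """Check a flow x against the dedicated-reservation constraints.

    x:        dict path name -> flow x(P)
    paths:    dict commodity -> dict path name -> list of links on the path
    demand:   dict commodity -> d_k
    capacity: dict link -> u(e)
    """
    if any(flow < -tol for flow in x.values()):
        return False
    # For every commodity k and every path P' that might fail, the flow
    # on the remaining paths must still cover the demand d_k.
    for k, plist in paths.items():
        total = sum(x.get(p, 0.0) for p in plist)
        for failed in plist:
            if total - x.get(failed, 0.0) < demand[k] - tol:
                return False
    # Total flow on each link must not exceed its capacity.
    load = {}
    for plist in paths.values():
        for p, links in plist.items():
            for e in links:
                load[e] = load.get(e, 0.0) + x.get(p, 0.0)
    return all(load[e] <= capacity[e] + tol for e in load)
```

For one commodity with demand 2 and three link-disjoint paths, sending 1 unit on each path (that is, $d_k/(r-1)$ with r = 3) passes the check, whereas sending 2, 1 and 0 units fails because losing the first path leaves only 1 unit.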

This is solved by solving the concurrent flow version with a budget constraint and then searching for the optimal budget. The corresponding primal and dual LP's are

$\begin{aligned} \max\ & \lambda \\ \text{s.t.}\ & \sum_{P \in \Lambda_k \backslash \{P'\}} x(P) \geq \lambda d_k && \forall k \in K,\ P' \in \Lambda_k \\ & \sum_{P : e \in P} x(P) \leq u(e) && \forall e \in E \\ & \sum_{P \in \Lambda} c(P) x(P) \leq B \\ & x(P) \geq 0 \end{aligned}$

The dual LP is as follows, where the first constraint is supposed to hold for all kεK and for all PεΛ_(k):

$\begin{aligned} \min\ & B\varphi + \sum_{e \in E} u(e) h(e) \\ \text{s.t.}\ & c(P)\varphi + \sum_{e \in P} h(e) \geq \sum_{P' \in \Lambda_k \backslash \{P\}} w^{P'} \\ & \sum_{k \in K} d_k \sum_{P \in \Lambda_k} w^{P} \geq 1 \\ & h, \varphi, w \geq 0 \end{aligned}$

Substituting $Z_k = \sum_{P \in \Lambda_k} w^{P}$, the dual can be rewritten as follows (once again, the first constraint is supposed to hold for all kεK and for all PεΛ_(k)):

$\begin{aligned} \min\ & B\varphi + \sum_{e \in E} u(e) h(e) \\ \text{s.t.}\ & w^{P} + c(P)\varphi + h(P) \geq Z_k \\ & \sum_{k \in K} d_k Z_k \geq 1 \\ & h, \varphi, w \geq 0 \end{aligned}$

where h(P) denotes $\sum_{e \in P} h(e)$.

As mentioned above, the algorithm and analysis now follow from the previous section.
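The outer search for the optimal budget can be sketched as a bisection. The sketch assumes a hypothetical oracle `lam(B)` that returns the optimal (or approximately optimal) concurrent-flow value λ for budget B, and it relies on the fact that this value cannot decrease when the budget grows. The function name, the bracketing interval, and the tolerance are assumptions; with an ε-approximate oracle the threshold of 1 would need to be relaxed accordingly.

```python
def search_budget(lam, lo, hi, tol=1e-6):
    """Smallest budget B in [lo, hi] (within tol) such that lam(B) >= 1.

    lam: callable returning the concurrent-flow value for a budget;
         assumed nondecreasing in the budget.
    """
    if lam(hi) < 1.0:
        raise ValueError("demands cannot be met even with budget hi")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if lam(mid) >= 1.0:
            hi = mid  # all demand can be routed; try a smaller budget
        else:
            lo = mid  # budget too small to route all demand
    return hi
```

With a toy oracle such as `lambda B: min(B / 12.0, 1.5)`, the search converges to a budget of 12, the smallest value at which the full demand can be routed.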

It is to be understood that the embodiments and variations shown and described herein are merely illustrative of the principles of this invention and that various modifications may be implemented by those skilled in the art without departing from the scope and spirit of the invention. 

1. A method for determining routing in a network to achieve an objective value within a prescribed bound from its minimum value, the network comprising a plurality of nodes interconnected through links, the method comprising: concurrently routing demands for each of a plurality of commodities on a set of paths having a minimum cost with respect to an iteratively changing cost function, each set of paths comprising at least one primary path and a secondary path, wherein each of the demands is routed from a primary path to a secondary path of the set during a failure; adjusting link costs using an exponential function based on an amount of flow through links over which each demand is routed and based on said at least one primary path and said secondary path; performing the step of adjusting for each of a number of potential failures; and iterating the steps of routing, adjusting, and performing until an objective value is reached which is within a prescribed bound of a pre-determined value, such that flow for each of the links in the network is determined.
 2. The method of claim 1, wherein the step of routing further comprises the step of minimizing a function that represents a marginal cost of a link when the network is in a particular state, wherein the function is minimized for both the at least one primary path and the at least one secondary path.
 3. The method of claim 1, wherein: the step of routing further comprises the step of routing a flow for one of the commodities on a set of paths having a minimum cost, the set of paths comprising at least one primary path and at least one secondary path, wherein the flow is routed from a primary path to a secondary path during a failure; the step of adjusting further comprises the step of adjusting a minimum total cost using an exponential function based on an amount of flow through links over which the flow is routed; and the method further comprises the step of iterating the steps of routing and adjusting until the demand for the commodity is routed.
 4. The method of claim 1, wherein the step of performing the step of adjusting further comprises the step of determining a backup flow strategy comprising specifying, for each failure, how much flow for a primary path gets re-routed to one or more secondary paths.
 5. The method of claim 4, wherein the backup flow strategy comprises allowing secondary paths to be shared, secondary paths to be dedicated, or secondary paths to be shared and dedicated.
 6. The method of claim 4, wherein the objective value is a total expected cost of flow in the network over a predetermined time period, wherein the expected cost is taken over a probability distribution that includes the failures, and wherein the backup flow strategy is created wherein flows for any failure are recovered by routing the flows through secondary paths.
 7. The method of claim 1, further comprising the step of computing a number of iterations after which the objective value is within a specified tolerance from an optimum objective value.
 8. A method for determining routing in a network comprising a plurality of nodes interconnected through links, to achieve an objective value within a prescribed bound from its minimum value, the method comprising: setting costs for each link in the network; initializing primary and secondary flows for each link to at least one predetermined value; selecting a commodity, said commodity comprising a source-sink pair and having a demand; routing a demand through the network for the selected commodity; updating costs for links over which the demand is routed, wherein said updating is based on said primary flows and said secondary flows; and performing the steps of selecting, routing, and updating until a value of an objective function is at least as much as a prescribed bound of a pre-determined value.
 9. The method of claim 8, wherein the step of performing the steps of selecting, routing, and updating until a value of an objective function is at least as much as a prescribed bound of a pre-determined value further comprises the step of performing the steps of selecting, routing, and updating until an approximate solution to the network routing is within a predetermined error from an optimum network routing.
 10. The method of claim 8, wherein the objective function is a dual objective function.
 11. The method of claim 10, wherein the dual objective function is part of a linear program designed to maximize a first variable of the dual objective function subject to a first plurality of conditions.
 12. The method of claim 11, wherein there is also a second objective function as part of a second linear program, the second linear program designed to minimize a variable of the second objective function subject to a second plurality of conditions, and wherein the method further comprises the step of using the second objective function to determine if the value of the dual objective function is correct.
 13. The method of claim 8, wherein the step of updating costs further comprises the step of, for each of a plurality of failure conditions and for each link over which demand is routed, updating costs using an exponential function.
 14. The method of claim 13, wherein the step of updating costs using an exponential function further comprises the steps of: determining if the primary flow is part of a set of paths affected by the failure condition; for all links that are part of the primary flow, updating costs for these primary flow links using the exponential function when the primary flow is part of a set of paths affected by the failure condition; and for all links that are part of the secondary flow, updating costs for these secondary flow links using the exponential function when the primary flow is part of a set of paths affected by the failure condition.
 15. The method of claim 13, wherein the exponential function is the following: e^(εu/u(e)), wherein ε is the predetermined error, u is an amount of flow currently routed on a link, and u(e) is a capacity of the link.
 16. The method of claim 8, wherein the step of routing demand through the network for the selected commodity further comprises the steps of: for each link over which demand is routed, determining an amount of demand to route on the link; increasing primary flow by the determined demand; and increasing secondary flow by the determined demand.
 17. The method of claim 16, wherein the determined demand is selected by selecting a minimum of one of the following: demand for the commodity; a capacity of a primary amount of demand; and a capacity of a secondary amount of demand.
 18. The method of claim 8, wherein the step of setting costs for each link in the network further comprises the step of setting costs for each link in the network by setting a cost for a link equal to a predetermined delta value divided by a capacity of the link.
 19. The method of claim 18, wherein the predetermined delta value is the following: (m|Q|/(1−ε))^(−1/ε), where m is a number of links in the network, |Q| is a number of failure conditions, and ε is the predetermined error.
 20. The method of claim 8, further comprising the steps of setting a desired budget and setting a current budget to a predetermined budget, and wherein the step of performing the steps of selecting, routing, and updating until a value of an objective function is at least as much as a prescribed bound of a pre-determined value further comprises the steps of selecting, routing, updating, and modifying the current budget until the value of the objective function is at least as much as the pre-determined value.
 21. An apparatus for determining routing in a network to achieve an objective value within a prescribed bound from its minimum value, the network comprising a plurality of nodes interconnected through links, the apparatus comprising: a memory that stores computer-readable code; a processor operatively coupled to the memory, the processor configured to execute the computer-readable code, the computer-readable code configured to: concurrently route demands for each of a plurality of commodities on a set of paths having a minimum cost with respect to an iteratively changing cost function, each set of paths comprising at least one primary path and a secondary path, wherein each of the demands is routed from a primary path to a secondary path of the set during a failure; adjust link costs using an exponential function based on an amount of flow through links over which each demand is routed and based on said at least one primary path and said secondary path; perform the step of adjusting for each of a number of potential failures; and iterate the steps of routing, adjusting, and performing until an objective value is reached which is within a prescribed bound of a pre-determined value, such that flow for each of the links in the network is determined.
 22. An article of manufacture for determining routing in a network to achieve an objective value within a prescribed bound from its minimum value, the network comprising a plurality of nodes interconnected through links, the article of manufacture comprising: a computer-readable medium having computer-readable code embodied thereon which when executed implement the steps of: concurrently routing demands for each of a plurality of commodities on a set of paths having a minimum cost with respect to an iteratively changing cost function, each set of paths comprising at least one primary path and a secondary path, wherein each of the demands is routed from a primary path to a secondary path of the set during a failure; adjusting link costs using an exponential function based on an amount of flow through links over which each demand is routed and based on said at least one primary path and said secondary path; performing the step of adjusting for each of a number of potential failures; and iterating the steps of routing, adjusting, and performing until an objective value is reached which is within a prescribed bound of a pre-determined value, such that flow for each of the links in the network is determined. 